Asio Deadlock Debugging: Async Callbacks, Locks, and Strands [#49-3]

Key takeaways

Why Asio deadlocks are subtle: callbacks, thread pools, and implicit lock order. Use strands, avoid waiting under locks, and debug with gdb thread apply all bt.

Introduction: “The Asio server sometimes hangs”

Asio servers can deadlock when a mutex is held while waiting for another async operation whose completion handler needs the same mutex. Multi-threaded io_context::run() makes this timing-dependent and “hidden.”

Topics:

  • Pattern: lock → async_* → blocking wait while the handler needs the lock
  • Different lock orders across threads
  • Strands, minimal lock scope, uniform ordering
  • gdb thread apply all bt, logging, TSan

See also: Multithreaded Asio, Strand.


Scenarios (short)

  • Chat server: sync wait for async_write completion under a lock.
  • Session pool: lock A then B vs B then A.
  • HTTP proxy: blocking wait for upstream under lock.
  • Async logging flush under lock.
  • Timer vs I/O callbacks taking locks in opposite order.

Pattern 1: Lock held while waiting for completion

// Dangerous (deadlock): mtx stays locked for the whole scope while we
// block until the handler completes — but the handler needs mtx too.
void on_send() {
    std::lock_guard<std::mutex> lk(mtx);             // held until on_send returns
    std::promise<void> done;
    boost::asio::async_write(socket, boost::asio::buffer(data),
        [&](boost::system::error_code, std::size_t) {
            std::lock_guard<std::mutex> lk2(mtx);    // blocks forever: caller holds mtx
            done.set_value();
        });
    done.get_future().wait();                        // waits forever → deadlock
}
sequenceDiagram
    participant T1 as Thread 1 (on_send)
    participant T2 as Thread 2 (io.run)
    participant Mtx as mutex
    T1->>Mtx: lock
    T1->>T2: async_write scheduled
    T1->>T1: blocking wait (holds mtx)
    T2->>Mtx: try lock in handler → blocks
    Note over T1,T2: Deadlock

Fix: Continue work in the completion handler, or post to a strand; do not wait on async completion while holding the mutex the handler needs.


Pattern 2: Lock order inversion

Thread A: mtx1 then mtx2. Thread B: mtx2 then mtx1. → Cycle.

Fix: Global order for all mutexes, or std::lock / std::scoped_lock to acquire both atomically.


Solutions: strand, small critical sections, ordering

Per-connection strand serializes handlers for that connection—often no mutex for session state.

boost::asio::async_write(socket_, boost::asio::buffer(msg_),  // msg_: pending outbound buffer
    boost::asio::bind_executor(strand_,
        [self = shared_from_this()](boost::system::error_code ec, std::size_t n) {
            self->on_write_done(ec, n);
        }));

Rules:

  • Do not wait for async completion while holding locks the handler needs.
  • If multiple mutexes: fixed order or std::scoped_lock.

Debugging

  • Hang + ~0% CPU → suspect deadlock.
  • gdb -p <pid>, then thread apply all bt full
  • Look for frames in pthread_mutex_lock / pthread_cond_wait and cross-thread wait cycles.
gdb -p <pid> -batch -ex "thread apply all bt"

TSan (-fsanitize=thread) may report lock-order inversion.


Production patterns

  • Document lock hierarchy (e.g. session → cache → log).
  • Strand-first design for connection state.
  • Optional watchdog timer if progress stalls.
  • SIGUSR1 handler for stack dump (limited; use gdb for all threads).

Checklist

  • No blocking wait for an async completion while holding a mutex its handler needs (and never block an io_context thread waiting on its own handlers)
  • Consistent lock order or std::scoped_lock
  • Per-session strand where possible
  • Small lock scope; no unknown callbacks under lock

Related posts:

  • Multithreaded Asio data race
  • Composed operations
  • Asio intro

FAQ

Q. When to use this?
A. Any multi-threaded Asio app with mutexes + condition variables + async I/O.

Q. Next reads?
A. Series index, strand and executor docs.


Summary

  • Deadlock: lock + wait for async completion whose handler needs the same lock.
  • Fix: async chaining, strands, lock ordering, std::scoped_lock.
  • Debug: all-thread backtraces, logging, TSan.

Previous: CMake link errors (#49-2)

Related: High-performance networking guide index


  • Segfault debugging
  • CMake link errors
  • Load balancer