C++ Data Races: When to Use Atomics Instead of Mutexes (Interview Guide)
Key points of this article
Practical guide to C++ data races, mutexes, atomics, deadlocks, and CAS for interviews and production.
Introduction: “When should I use atomics instead of a mutex?”
Series #7 compressed for interviews
The guides C++ in practice #7-2: mutex and synchronization and #7-4: atomics and the memory model cover the basics. This article prepares you for follow-up interview questions: the definition of a data race, mutex vs atomic trade-offs, deadlock fixes, and CAS.
What this article covers:
- Data race (C++ standard definition) and why synchronization exists
- Mutex: critical section, one thread at a time
- Atomic: single-variable atomic operations without a lock
- Mutex vs atomic — how to choose
- Deadlock and fixes (lock ordering, std::lock)
- CAS (Compare-And-Swap: atomically update memory only if it matches an expected value; used in lock-free algorithms)
After reading:
- You can state the precise definition of a data race and recognize real bugs.
- You can choose between mutex and atomic appropriately.
- You can apply practical patterns to avoid deadlocks.
- You understand CAS and compare_exchange_*.
Table of contents
- What is a data race?
- Problem scenarios: real data races
- Role and limits of a mutex
- Role and limits of atomics
- Mutex vs atomic: how to choose
- Deadlocks and fixes
- CAS overview
- Common mistakes
- Performance: mutex vs atomic
- Production patterns
- Interview Q&A
1. What is a data race?
Definition in the C++ standard
- A data race occurs when two threads access the same memory location, at least one is a write, the accesses are not ordered by a happens-before relation from synchronization, and they are not both atomic according to the rules.
- A program with a data race has undefined behavior in C++. “Sometimes wrong” often comes from data races.
Synchronization tools
- Mutex: only the thread holding the lock enters the protected region; shared accesses do not overlap.
- Atomic: reads/writes/read-modify-writes on one variable are atomic, removing data races on that variable for those operations.
- Condition variables, etc.: coordinate waiting and notification.
In interviews you can say: “A data race is unsynchronized conflicting access on the same location; it’s UB. Use mutexes, atomics, or other synchronization to prevent it.”
2. Problem scenarios: real data races
Scenario 1: Broken counter
Several worker threads increment one shared counter; after batch processing, totals diverge from the database by tens of thousands.
// Bad: counter++ is not atomic
#include <thread>
#include <iostream>

int counter = 0;

void increment() {
    for (int i = 0; i < 100000; ++i) {
        counter++; // read-modify-write is not atomic → data race
    }
}

int main() {
    std::thread t1(increment);
    std::thread t2(increment);
    t1.join();
    t2.join();
    std::cout << "counter = " << counter << "\n"; // not 200000!
    return 0;
}
Why it breaks: counter++ splits into load, add, store. If two threads interleave after reading the same value, one update can be lost.
sequenceDiagram
participant T1 as Thread 1
participant Mem as Memory
participant T2 as Thread 2
Mem->>T1: read: 0
Mem->>T2: read: 0
T1->>T1: +1 → 1
T2->>T2: +1 → 1
T1->>Mem: write: 1
T2->>Mem: write: 1
Note over Mem: result 1 (expected 2)
Scenario 2: Inventory bug (check-then-act)
E-commerce: concurrent purchases can drive stock negative.
int stock = 100;

void purchase(int quantity) {
    if (stock >= quantity) {
        // Another thread may run here
        stock -= quantity;
    }
}
Cause: The check and update are not atomic—classic check-then-act. Protect both with one mutex.
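A minimal fix is to guard the check and the update with the same mutex so they become one logical step (the names stock_mtx and the bool return are illustrative, not from the original snippet):

```cpp
#include <mutex>

int stock = 100;
std::mutex stock_mtx;

// Returns true when the purchase succeeded.
bool purchase(int quantity) {
    std::lock_guard<std::mutex> lock(stock_mtx);
    if (stock >= quantity) { // check and update now happen under one lock
        stock -= quantity;
        return true;
    }
    return false;
}
```

Returning success/failure also lets callers handle the out-of-stock case instead of silently skipping the update.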
Scenario 3: Flag/data mismatch
One thread writes data then sets a flag; another reads the flag then data. The compiler or CPU may reorder stores so the flag becomes visible before the data.
bool ready = false;
int data = 0;

void producer() {
    data = 42;
    ready = true; // reordering possible
}

void consumer() {
    while (!ready)
        ;
    std::cout << data << "\n"; // may still see 0
}
Fix: std::atomic<bool> ready and atomic data, or protect both with a mutex.
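A sketch of the atomic-flag fix: with release/acquire ordering on the flag, the write to data happens-before the flag becomes visible, so the payload itself can stay a plain int:

```cpp
#include <atomic>
#include <iostream>

std::atomic<bool> ready{false};
int data = 0; // plain int is fine: the atomic flag orders the accesses

void producer() {
    data = 42;                                    // ordered before the store below
    ready.store(true, std::memory_order_release);
}

void consumer() {
    while (!ready.load(std::memory_order_acquire))
        ;                                         // spin until the flag is visible
    std::cout << data << "\n";                    // guaranteed to print 42
}
```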
Scenario 4: Broken double-checked locking
Naïve DCLP before C++11 can publish a pointer before the object is fully constructed.
Fix in modern C++: std::call_once or Meyers’ singleton with a static local variable.
Scenario 5: Caches and visibility
Without synchronization, a write on one core may not be visible immediately to another core’s cache. Atomics/mutexes insert the needed memory ordering for visibility.
3. Role and limits of a mutex
Role
- Ensures only one thread at a time runs the critical section.
- Good when multiple variables or complex invariants must be updated together (maps, queues).
sequenceDiagram
participant T1 as Thread 1
participant M as mutex
participant T2 as Thread 2
T1->>M: lock()
M-->>T1: acquired
T2->>M: lock()
Note over T2: waiting
T1->>M: critical section
T1->>M: unlock()
M-->>T2: acquired
T2->>M: critical section
Protecting a counter with a mutex
#include <mutex>
#include <thread>
#include <iostream>

int counter = 0;
std::mutex mtx;

void increment() {
    for (int i = 0; i < 100000; ++i) {
        std::lock_guard<std::mutex> lock(mtx);
        counter++;
    }
}
unique_lock and try_lock
Use unique_lock with try_to_lock or timed_mutex::try_lock_for for conditional or timed locking.
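A small sketch of the non-blocking variant (the function name and shared mtx are illustrative):

```cpp
#include <mutex>

std::mutex mtx;

// Try to grab the lock without blocking; fall back to other work if busy.
bool tryDoWork() {
    std::unique_lock<std::mutex> lock(mtx, std::try_to_lock);
    if (!lock.owns_lock()) {
        return false; // another thread holds the mutex right now
    }
    // ... critical section ...
    return true;
}
```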
Mutex types (summary)
| Mutex | Use |
|---|---|
| std::mutex | Default mutual exclusion |
| std::recursive_mutex | Same thread may lock multiple times |
| std::timed_mutex | try_lock_for / try_lock_until |
| std::shared_mutex (C++17) | Many readers, one writer |
Limits
- Holding locks too long serializes work and hurts throughput.
- Multiple locks with inconsistent order → deadlock.
4. Role and limits of atomics
Role
- Makes reads/writes/RMWs on one variable atomic without a mutex.
- Often cheaper than a mutex for simple hot counters under contention.
Limits
- One variable only—invariants spanning multiple fields need a mutex (or a carefully designed lock-free structure).
- Splitting a logical update into multiple atomic ops can still race.
Counter with std::atomic
#include <atomic>
#include <thread>
#include <iostream>

std::atomic<int> counter{0};

void increment() {
    for (int i = 0; i < 100000; ++i) {
        counter.fetch_add(1);
    }
}
memory_order (interview cheat sheet)
Default is seq_cst. Use acquire/release for handoff synchronization; relaxed only for independent counters where no cross-variable ordering is required.
5. Mutex vs atomic: how to choose
Prefer a mutex when
- You must update multiple variables together.
- Read-check-write must be one atomic step logically.
- You modify complex containers concurrently.
Prefer atomics when
- A single variable (counter, flag, pointer) suffices.
- You want to reduce lock overhead on a hot counter.
flowchart TD
subgraph choose["Mutex vs atomic"]
A[What are you protecting?] --> B{Single variable?}
B -->|Yes| C{Simple op?}
C -->|Yes| D[Atomic]
C -->|No| E{Complex condition?}
E -->|Yes| F[Mutex]
B -->|No| F
D --> G[Counters, flags]
F --> H[Maps, queues, multi-field updates]
end
One-liner for interviews: Use atomics for one variable and simple RMW; use a mutex for multiple fields or complex conditions.
6. Deadlocks and fixes
What a deadlock is
Two threads wait forever for locks the other holds (e.g., T1 holds A and waits for B while T2 holds B and waits for A).
sequenceDiagram
participant T1 as Thread 1
participant A as Lock A
participant B as Lock B
participant T2 as Thread 2
T1->>A: lock() ✓
T2->>B: lock() ✓
T1->>B: lock() ... waiting
T2->>A: lock() ... waiting
Note over T1,T2: Deadlock
Fix 1: fixed lock order
Always acquire A then B in every thread so circular wait cannot occur.
void thread1() {
    std::lock_guard<std::mutex> lockA(mutexA);
    std::lock_guard<std::mutex> lockB(mutexB);
}

void thread2() {
    std::lock_guard<std::mutex> lockA(mutexA); // same order
    std::lock_guard<std::mutex> lockB(mutexB);
}
Fix 2: std::lock
Acquire two or more mutexes atomically without deadlock, then adopt ownership:
// g++ -std=c++17 -pthread -o deadlock_avoid deadlock_avoid.cpp && ./deadlock_avoid
#include <mutex>
#include <thread>
#include <iostream>

std::mutex mutexA, mutexB;

void useBothLocks() {
    std::lock(mutexA, mutexB);
    std::lock_guard<std::mutex> lockA(mutexA, std::adopt_lock);
    std::lock_guard<std::mutex> lockB(mutexB, std::adopt_lock);
    std::cout << "Both locks acquired\n";
}

int main() {
    std::thread t1(useBothLocks);
    std::thread t2(useBothLocks);
    t1.join();
    t2.join();
    std::cout << "Done\n";
    return 0;
}
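Since C++17, std::scoped_lock folds the std::lock plus adopt_lock pair into a single RAII object; a minimal sketch:

```cpp
#include <mutex>

std::mutex mutexA, mutexB;

void useBothLocks() {
    // Locks both mutexes with the same deadlock-avoidance algorithm as
    // std::lock, and releases both when the scope ends.
    std::scoped_lock lock(mutexA, mutexB);
    // ... critical section touching both resources ...
}
```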
7. CAS (Compare-And-Swap)
Concept
CAS atomically writes a new value only if the current value equals expected; otherwise it leaves the value unchanged and updates expected for retry loops.
In C++
Use std::atomic::compare_exchange_strong / compare_exchange_weak.
// Lock-free spinlock sketch
#include <atomic>

std::atomic<bool> lock_flag{false};

void lock() {
    bool expected = false;
    while (!lock_flag.compare_exchange_strong(expected, true)) {
        expected = false; // CAS failure overwrote expected with the current value
    }
}

void unlock() {
    lock_flag.store(false);
}
Strong vs weak: weak may spuriously fail on some ISAs—fine in retry loops; strong is simpler for single-shot checks.
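A typical retry loop with compare_exchange_weak, the common idiom for read-modify-write operations that fetch_add cannot express (the atomicDouble name is illustrative):

```cpp
#include <atomic>

std::atomic<int> value{0};

// Atomically double the counter using a CAS retry loop.
void atomicDouble() {
    int expected = value.load();
    // On failure (including spurious failure), compare_exchange_weak reloads
    // `expected` with the current value, so the loop retries with fresh data.
    while (!value.compare_exchange_weak(expected, expected * 2)) {
    }
}
```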
8. Production patterns
Pattern 1: thread-safe Meyers’ singleton (C++11)
class Config {
public:
    static Config& instance() {
        static Config cfg; // thread-safe one-time init
        return cfg;
    }
    Config(const Config&) = delete;
    Config& operator=(const Config&) = delete;
private:
    Config() = default;
};
Pattern 2: std::call_once for heavy initialization
#include <mutex>

std::once_flag init_flag;
Database* db = nullptr;

void initDatabase() {
    std::call_once(init_flag, []{
        db = new Database("connection_string");
        db->connect();
    });
}
Pattern 3: producer–consumer queue (mutex + condition variable)
Use std::mutex, std::condition_variable, and notify_one() outside the lock when possible for performance.
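A minimal sketch of this pattern (the TaskQueue class and int payload are illustrative):

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>

class TaskQueue {
public:
    void push(int task) {
        {
            std::lock_guard<std::mutex> lock(mtx_);
            queue_.push(task);
        }
        cv_.notify_one(); // notify outside the lock to avoid a wasted wakeup
    }

    int pop() {
        std::unique_lock<std::mutex> lock(mtx_);
        // The predicate guards against spurious wakeups.
        cv_.wait(lock, [this] { return !queue_.empty(); });
        int task = queue_.front();
        queue_.pop();
        return task;
    }

private:
    std::mutex mtx_;
    std::condition_variable cv_;
    std::queue<int> queue_;
};
```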
Pattern 4: read-mostly cache with shared_mutex (C++17)
Many concurrent readers, rare writers: std::shared_lock for reads, std::unique_lock for writes.
Pattern 5: per-thread counters + periodic merge
Accumulate in thread_local counters, then fetch_add into a global atomic with memory_order_relaxed to reduce contention.
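A sketch of the pattern, assuming each worker calls countEvents once per batch (names are illustrative):

```cpp
#include <atomic>

std::atomic<long> global_total{0};

// Each thread accumulates privately and merges once at the end; only the
// final fetch_add touches shared state, so contention stays low. relaxed is
// enough because nothing else is ordered by this counter.
void countEvents(int n) {
    thread_local long local_count = 0;
    for (int i = 0; i < n; ++i) {
        ++local_count; // thread-private: no synchronization needed
    }
    global_total.fetch_add(local_count, std::memory_order_relaxed);
    local_count = 0; // reset for the next batch
}
```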
Pattern 6: graceful shutdown flag
std::atomic<bool> shutdown_requested{false};

void workerThread() {
    while (!shutdown_requested.load(std::memory_order_acquire)) {
        doWork();
    }
}

void requestShutdown() {
    shutdown_requested.store(true, std::memory_order_release);
}
9. Common mistakes
- Atomic flag + non-atomic payload still data-races—make both atomic or use one mutex.
- Check-then-act split across atomics—use CAS or a mutex.
- I/O or network calls under a mutex—copy out, release lock, then I/O.
- Inconsistent lock order across threads.
- Recursive lock on std::mutex: use std::recursive_mutex if you must re-enter.
- Missing unlock on early return: prefer RAII (lock_guard / unique_lock).
- No ThreadSanitizer in CI: add -fsanitize=thread builds.
g++ -std=c++17 -fsanitize=thread -g -o race_test race_test.cpp
./race_test
10. Performance: mutex vs atomic
Rough expectations (machine dependent): a hot atomic counter can be several times faster than the same logic guarded by a mutex under heavy contention; when contention is low and sections are tiny, mutex overhead may be small—always benchmark your workload.
Tips: minimize lock scope; consider shared_mutex for read-heavy paths; use cache-line padding (alignas(64)) to avoid false sharing (see #34-2).
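The padding tip can be sketched like this, assuming a 64-byte cache line (PaddedCounter and the slot count are illustrative):

```cpp
#include <atomic>

// Pad each per-thread slot to its own cache line so neighboring counters
// don't bounce the same line between cores (false sharing). 64 bytes is a
// common line size; verify for your target CPU.
struct alignas(64) PaddedCounter {
    std::atomic<long> value{0};
};

PaddedCounter counters[4]; // e.g. one slot per worker thread
```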
11. Interview Q&A
Q: What is a data race?
Conflicting unsynchronized accesses on the same location where at least one is a write → UB; fix with mutex/atomic/etc.
Q: When use atomics instead of mutex?
Single variable, simple RMW; otherwise mutex.
Q: Avoid deadlocks?
Single lock order or std::lock for multiple mutexes.
Q: What is CAS?
Atomic compare-and-swap; compare_exchange_* in C++.
Q: When tune memory_order?
Start with defaults; relax only after profiling shows atomic overhead matters—use relaxed mainly for standalone counters.
Related posts (internal links)
- C++ lock-free: CAS, ABA, memory orders (#34-3)
- Multithreaded Asio: data races (#2)
- C++ atomics deep dive (#7-4)
Keywords
data race, mutex vs atomic, C++ synchronization, deadlock, compare_exchange, memory_order, ThreadSanitizer
Summary
- Data race → UB; synchronize with mutex or atomic (or other facilities).
- Mutex protects multi-statement invariants; atomics protect single variables efficiently when appropriate.
- Deadlock → consistent ordering or std::lock.
- CAS implements conditional atomic updates for lock-free structures.
Next: Cache line alignment & padding (#34-2)
Previous: shared_ptr circular references (#33-4)
FAQ (extended)
Q. When does this matter in practice?
A. Any multithreaded C++: shared counters, flags, maps/queues, and interview questions on UB, mutex vs atomic, deadlocks, and CAS.
Q. What should I read first?
A. Follow the series index: C++ series index. #7-2 mutex and #7-4 atomic help before this article.
Q. Go deeper?
A. See cppreference for std::atomic and std::mutex, and learn memory models / lock-free after mastering the basics here.
Practical tips
Debugging
- Enable warnings; reproduce with small tests.
- Use ThreadSanitizer in CI.
Performance
- Profile before micro-optimizing
memory_order.
Code review
- Check every shared mutable location for synchronization.
Checklists
Before coding
- Is this the right tool for the problem?
- Can teammates maintain it?
While coding
- Warnings clean?
- Edge cases covered?
Review
- Intent clear?
- Tests sufficient?