C++ Data Races: When to Use Atomics Instead of Mutexes (Interview Guide)

Key takeaways

Practical guide to C++ data races, mutexes, atomics, deadlocks, and CAS for interviews and production.

Introduction: “When should I use atomics instead of a mutex?”

Series #7 compressed for interviews

The guides C++ in practice #7-2: mutex and synchronization and #7-4: atomics and the memory model cover the basics. This article prepares you for follow-up interview questions: the definition of a data race, mutex vs atomic trade-offs, deadlock fixes, and CAS.

What this article covers:

  • Data race (C++ standard definition) and why synchronization exists
  • Mutex: critical section, one thread at a time
  • Atomic: single-variable atomic operations without a lock
  • Mutex vs atomic — how to choose
  • Deadlock and fixes (lock ordering, std::lock)
  • CAS (Compare-And-Swap: atomically update memory only if it matches an expected value; used in lock-free algorithms)

After reading:

  • You can state the precise definition of a data race and recognize real bugs.
  • You can choose between mutex and atomic appropriately.
  • You can apply practical patterns to avoid deadlocks.
  • You understand CAS and compare_exchange_*.

Table of contents

  1. What is a data race?
  2. Problem scenarios: real data races
  3. Role and limits of a mutex
  4. Role and limits of atomics
  5. Mutex vs atomic: how to choose
  6. Deadlocks and fixes
  7. CAS (Compare-And-Swap)
  8. Production patterns
  9. Common mistakes
  10. Performance: mutex vs atomic
  11. Interview Q&A

1. What is a data race?

Definition in the C++ standard

  • A data race occurs when two threads access the same memory location, at least one is a write, the accesses are not ordered by a happens-before relation from synchronization, and they are not both atomic according to the rules.
  • A program with a data race has undefined behavior in C++. “Sometimes wrong” often comes from data races.

Synchronization tools

  • Mutex: only the thread holding the lock enters the protected region; shared accesses do not overlap.
  • Atomic: reads/writes/read-modify-writes on one variable are atomic, removing data races on that variable for those operations.
  • Condition variables, etc.: coordinate waiting and notification.

In interviews you can say: “A data race is unsynchronized conflicting access on the same location; it’s UB. Use mutexes, atomics, or other synchronization to prevent it.”


2. Problem scenarios: real data races

Scenario 1: Broken counter

Several worker threads increment one shared counter; after batch processing, totals diverge from the database by tens of thousands.

// Bad: counter++ is not atomic
#include <thread>
#include <iostream>

int counter = 0;

void increment() {
    for (int i = 0; i < 100000; ++i) {
        counter++;  // read-modify-write is not atomic → data race
    }
}

int main() {
    std::thread t1(increment);
    std::thread t2(increment);
    t1.join();
    t2.join();
    std::cout << "counter = " << counter << "\n";  // not 200000!
    return 0;
}

Why it breaks: counter++ splits into load, add, store. If two threads interleave after reading the same value, one update can be lost.

sequenceDiagram
    participant T1 as Thread 1
    participant Mem as Memory
    participant T2 as Thread 2
    Mem->>T1: read: 0
    Mem->>T2: read: 0
    T1->>T1: +1 → 1
    T2->>T2: +1 → 1
    T1->>Mem: write: 1
    T2->>Mem: write: 1
    Note over Mem: result 1 (expected 2)

Scenario 2: Inventory bug (check-then-act)

E-commerce: concurrent purchases can drive stock negative.

int stock = 100;

void purchase(int quantity) {
    if (stock >= quantity) {
        // Another thread may run here
        stock -= quantity;
    }
}

Cause: The check and update are not atomic—classic check-then-act. Protect both with one mutex.
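A minimal sketch of that fix, assuming a `std::mutex` guards both steps (the mutex name and the bool return value are illustrative additions):

```cpp
#include <mutex>

int stock = 100;
std::mutex stock_mtx;

// The check and the decrement happen under one lock, so two buyers
// cannot both pass the check and drive stock negative.
bool purchase(int quantity) {
    std::lock_guard<std::mutex> lock(stock_mtx);
    if (stock >= quantity) {
        stock -= quantity;
        return true;   // purchase succeeded
    }
    return false;      // not enough stock
}
```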

Scenario 3: Flag/data mismatch

One thread writes data then sets a flag; another reads the flag then data. The compiler or CPU may reorder stores so the flag becomes visible before the data.

bool ready = false;
int data = 0;

void producer() {
    data = 42;
    ready = true;  // reordering possible
}

void consumer() {
    while (!ready)
        ;
    std::cout << data << "\n";  // may still see 0
}

Fix: make ready a std::atomic<bool> written with a release store and read with an acquire load (which also publishes the write to data), or protect both variables with one mutex.

Scenario 4: Broken double-checked locking

Naïve DCLP before C++11 can publish a pointer before the object is fully constructed.

Fix in modern C++: std::call_once or Meyers’ singleton with a static local variable.

Scenario 5: Caches and visibility

Without synchronization, a write on one core may not be visible immediately to another core’s cache. Atomics/mutexes insert the needed memory ordering for visibility.


3. Role and limits of a mutex

Role

  • Ensures only one thread at a time runs the critical section.
  • Good when multiple variables or complex invariants must be updated together (maps, queues).
sequenceDiagram
    participant T1 as Thread 1
    participant M as mutex
    participant T2 as Thread 2
    T1->>M: lock()
    M-->>T1: acquired
    T2->>M: lock()
    Note over T2: waiting
    T1->>M: critical section
    T1->>M: unlock()
    M-->>T2: acquired
    T2->>M: critical section

Protecting a counter with a mutex

#include <mutex>
#include <thread>
#include <iostream>

int counter = 0;
std::mutex mtx;

void increment() {
    for (int i = 0; i < 100000; ++i) {
        std::lock_guard<std::mutex> lock(mtx);
        counter++;
    }
}

unique_lock and try_lock

Use unique_lock with try_to_lock or timed_mutex::try_lock_for for conditional or timed locking.

Mutex types (summary)

Mutex                     | Use
std::mutex                | Default mutual exclusion
std::recursive_mutex      | Same thread may lock multiple times
std::timed_mutex          | try_lock_for / try_lock_until
std::shared_mutex (C++17) | Many readers, one writer

Limits

  • Holding locks too long serializes work and hurts throughput.
  • Multiple locks with inconsistent order → deadlock.

4. Role and limits of atomics

Role

  • Makes reads/writes/RMWs on one variable atomic without a mutex.
  • Often cheaper than a mutex for simple hot counters under contention.

Limits

  • One variable only—invariants spanning multiple fields need a mutex (or a carefully designed lock-free structure).
  • Splitting a logical update into multiple atomic ops can still race.

Counter with std::atomic

#include <atomic>
#include <thread>
#include <iostream>

std::atomic<int> counter{0};

void increment() {
    for (int i = 0; i < 100000; ++i) {
        counter.fetch_add(1);
    }
}

memory_order (interview cheat sheet)

Default is seq_cst. Use acquire/release for handoff synchronization; relaxed only for independent counters where no cross-variable ordering is required.


5. Mutex vs atomic: how to choose

Prefer a mutex when

  • You must update multiple variables together.
  • Read-check-write must be one atomic step logically.
  • You modify complex containers concurrently.

Prefer atomics when

  • A single variable (counter, flag, pointer) suffices.
  • You want to reduce lock overhead on a hot counter.
flowchart TD
    subgraph choose["Mutex vs atomic"]
        A[What are you protecting?] --> B{Single variable?}
        B -->|Yes| C{Simple op?}
        C -->|Yes| D[Atomic]
        C -->|No| E{Complex condition?}
        E -->|Yes| F[Mutex]
        B -->|No| F
        D --> G[Counters, flags]
        F --> H[Maps, queues, multi-field updates]
    end

One-liner for interviews: Use atomics for one variable and simple RMW; use a mutex for multiple fields or complex conditions.


6. Deadlocks and fixes

What a deadlock is

Two threads wait forever for locks the other holds (e.g., T1 holds A and waits for B while T2 holds B and waits for A).

sequenceDiagram
    participant T1 as Thread 1
    participant A as Lock A
    participant B as Lock B
    participant T2 as Thread 2
    T1->>A: lock() ✓
    T2->>B: lock() ✓
    T1->>B: lock() ... waiting
    T2->>A: lock() ... waiting
    Note over T1,T2: Deadlock

Fix 1: fixed lock order

Always acquire A then B in every thread so circular wait cannot occur.

void thread1() {
    std::lock_guard<std::mutex> lockA(mutexA);
    std::lock_guard<std::mutex> lockB(mutexB);
}

void thread2() {
    std::lock_guard<std::mutex> lockA(mutexA);  // same order
    std::lock_guard<std::mutex> lockB(mutexB);
}

Fix 2: std::lock

Lock two or more mutexes together using a deadlock-avoidance algorithm, then adopt ownership into RAII guards:

// g++ -std=c++17 -pthread -o deadlock_avoid deadlock_avoid.cpp && ./deadlock_avoid
#include <mutex>
#include <thread>
#include <iostream>

std::mutex mutexA, mutexB;

void useBothLocks() {
    std::lock(mutexA, mutexB);
    std::lock_guard<std::mutex> lockA(mutexA, std::adopt_lock);
    std::lock_guard<std::mutex> lockB(mutexB, std::adopt_lock);
    std::cout << "Both locks acquired\n";
}

int main() {
    std::thread t1(useBothLocks);
    std::thread t2(useBothLocks);
    t1.join();
    t2.join();
    std::cout << "Done\n";
    return 0;
}

7. CAS (Compare-And-Swap)

Concept

CAS atomically writes a new value only if the current value equals an expected value; otherwise it leaves memory unchanged. C++'s compare_exchange additionally loads the actual value into expected on failure, which is exactly what a retry loop needs.

In C++

Use std::atomic::compare_exchange_strong / compare_exchange_weak.

// Lock-free spinlock sketch
#include <atomic>

std::atomic<bool> lock_flag{false};

void lock() {
    bool expected = false;
    while (!lock_flag.compare_exchange_strong(expected, true)) {
        expected = false;
    }
}

void unlock() {
    lock_flag.store(false);
}

Strong vs weak: weak may spuriously fail on some ISAs—fine in retry loops; strong is simpler for single-shot checks.
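A typical compare_exchange_weak retry loop, here computing an atomic maximum (a common lock-free idiom; the names are illustrative):

```cpp
#include <atomic>

std::atomic<int> max_seen{0};

// Atomically raise max_seen to at least `value`.
void updateMax(int value) {
    int current = max_seen.load();
    while (current < value &&
           !max_seen.compare_exchange_weak(current, value)) {
        // On failure (including spurious failure), compare_exchange_weak
        // reloads `current` with the latest value and the loop re-checks.
    }
}
```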


8. Production patterns

Pattern 1: thread-safe Meyers’ singleton (C++11)

class Config {
public:
    static Config& instance() {
        static Config cfg;  // thread-safe one-time init
        return cfg;
    }
    Config(const Config&) = delete;
    Config& operator=(const Config&) = delete;
private:
    Config() = default;
};

Pattern 2: std::call_once for heavy initialization

#include <mutex>

std::once_flag init_flag;
Database* db = nullptr;

void initDatabase() {
    std::call_once(init_flag, []{
        db = new Database("connection_string");
        db->connect();
    });
}

Pattern 3: producer–consumer queue (mutex + condition variable)

Use std::mutex, std::condition_variable, and notify_one() outside the lock when possible for performance.
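A compact sketch of the pattern (the SafeQueue name is illustrative; shutdown handling and capacity limits are omitted):

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>

template <typename T>
class SafeQueue {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mtx_);
            q_.push(std::move(value));
        }                       // release the lock before notifying
        cv_.notify_one();
    }

    T pop() {                   // blocks until an item is available
        std::unique_lock<std::mutex> lock(mtx_);
        cv_.wait(lock, [this] { return !q_.empty(); });
        T value = std::move(q_.front());
        q_.pop();
        return value;
    }

private:
    std::mutex mtx_;
    std::condition_variable cv_;
    std::queue<T> q_;
};
```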

Pattern 4: read-mostly cache with shared_mutex (C++17)

Many concurrent readers, rare writers: std::shared_lock for reads, std::unique_lock for writes.

Pattern 5: per-thread counters + periodic merge

Accumulate in thread_local counters, then fetch_add into a global atomic with memory_order_relaxed to reduce contention.

Pattern 6: graceful shutdown flag

std::atomic<bool> shutdown_requested{false};

void workerThread() {
    while (!shutdown_requested.load(std::memory_order_acquire)) {
        doWork();
    }
}

void requestShutdown() {
    shutdown_requested.store(true, std::memory_order_release);
}

9. Common mistakes

  1. Publishing a non-atomic payload through a flag with weak or no ordering still data-races—use a release store / acquire load on an atomic flag, or one mutex.
  2. Check-then-act split across atomics—use CAS or a mutex.
  3. I/O or network calls under a mutex—copy out, release lock, then I/O.
  4. Inconsistent lock order across threads.
  5. Recursive lock on std::mutex—use std::recursive_mutex if you must re-enter.
  6. Missing unlock on early return—prefer RAII (lock_guard / unique_lock).
  7. No ThreadSanitizer in CI—add -fsanitize=thread builds.
g++ -std=c++17 -fsanitize=thread -g -o race_test race_test.cpp
./race_test
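For mistake 3 above, the copy-out pattern can be sketched like this (sendOverNetwork is a placeholder that here just counts items):

```cpp
#include <mutex>
#include <string>
#include <vector>

std::mutex pending_mtx;
std::vector<std::string> pending;
int sent_count = 0;  // stands in for real network I/O

// Placeholder for a slow network call (illustrative only).
void sendOverNetwork(const std::vector<std::string>& batch) {
    sent_count += static_cast<int>(batch.size());
}

void flush() {
    std::vector<std::string> batch;
    {
        std::lock_guard<std::mutex> lock(pending_mtx);
        batch.swap(pending);   // O(1) handoff under the lock
    }                          // lock released here
    sendOverNetwork(batch);    // slow I/O runs outside the critical section
}
```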

10. Performance: mutex vs atomic

Rough expectations (machine dependent): a hot atomic counter can be several times faster than the same logic guarded by a mutex under heavy contention; when contention is low and sections are tiny, mutex overhead may be small—always benchmark your workload.

Tips: minimize lock scope; consider shared_mutex for read-heavy paths; use cache-line padding (alignas(64)) to avoid false sharing (see #34-2).
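A minimal harness for measuring this on your own machine (a sketch; absolute numbers vary widely with CPU, thread count, and contention):

```cpp
#include <atomic>
#include <chrono>
#include <mutex>
#include <thread>

constexpr int kIters = 100000;

std::atomic<long> atomic_counter{0};
long mutex_counter = 0;
std::mutex counter_mtx;

// Run `work` on two threads and return the elapsed milliseconds.
template <typename F>
long long timeTwoThreads(F work) {
    auto start = std::chrono::steady_clock::now();
    std::thread t1(work), t2(work);
    t1.join();
    t2.join();
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(end - start)
        .count();
}

long long benchAtomic() {
    return timeTwoThreads([] {
        for (int i = 0; i < kIters; ++i) atomic_counter.fetch_add(1);
    });
}

long long benchMutex() {
    return timeTwoThreads([] {
        for (int i = 0; i < kIters; ++i) {
            std::lock_guard<std::mutex> lock(counter_mtx);
            ++mutex_counter;
        }
    });
}
```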


11. Interview Q&A

Q: What is a data race?
Conflicting unsynchronized accesses on the same location where at least one is a write → UB; fix with mutex/atomic/etc.

Q: When use atomics instead of mutex?
Single variable, simple RMW; otherwise mutex.

Q: Avoid deadlocks?
Single lock order or std::lock for multiple mutexes.

Q: What is CAS?
Atomic compare-and-swap; compare_exchange_* in C++.

Q: When tune memory_order?
Start with defaults; relax only after profiling shows atomic overhead matters—use relaxed mainly for standalone counters.


Related articles

  • C++ lock-free: CAS, ABA, memory orders (#34-3)
  • Multithreaded Asio: data races (#2)
  • C++ atomics deep dive (#7-4)

Keywords

data race, mutex vs atomic, C++ synchronization, deadlock, compare_exchange, memory_order, ThreadSanitizer

Summary

  • Data race → UB; synchronize with mutex or atomic (or other facilities).
  • Mutex protects multi-statement invariants; atomics protect single variables efficiently when appropriate.
  • Deadlock → consistent ordering or std::lock.
  • CAS implements conditional atomic updates for lock-free structures.

Next: Cache line alignment & padding (#34-2)

Previous: shared_ptr circular references (#33-4)


FAQ (extended)

Q. When does this matter in practice?

A. Any multithreaded C++: shared counters, flags, maps/queues, and interview questions on UB, mutex vs atomic, deadlocks, and CAS.

Q. What should I read first?

A. Follow the C++ series index; #7-2 (mutex and synchronization) and #7-4 (atomics) are helpful background for this article.

Q. Go deeper?

A. See cppreference for std::atomic and std::mutex, and learn memory models / lock-free after mastering the basics here.


Practical tips

Debugging

  • Enable warnings; reproduce with small tests.
  • Use ThreadSanitizer in CI.

Performance

  • Profile before micro-optimizing memory_order.

Code review

  • Check every shared mutable location for synchronization.

Checklists

Before coding

  • Is this the right tool for the problem?
  • Can teammates maintain it?

While coding

  • Warnings clean?
  • Edge cases covered?

Review

  • Intent clear?
  • Tests sufficient?