C++ Segmentation Fault: Five Causes and Debugging with GDB
Key takeaways
Fix segfaults: null dereference, dangling pointers, stack overflow, buffer overrun, bad casts. Core dumps, GDB/LLDB backtraces, and AddressSanitizer (-fsanitize=address).
Related: [undefined behavior](/en/blog/cpp-error-10-undefined-behavior/) · another segfault walkthrough: [segmentation fault (checklist)](/en/blog/cpp-error-27-segmentation-fault/).
Introduction: “Segmentation fault (core dumped)”
A segmentation fault (often called a segfault) is a runtime failure: the CPU detects an invalid access to the process’s virtual address space, and the kernel typically delivers SIGSEGV (signal number 11 on Linux and most other POSIX systems) to the process. The message you see in the shell:
Segmentation fault (core dumped)
reflects that the default disposition for an unhandled SIGSEGV is to terminate the process, optionally leaving a core file (when ulimit and core_pattern allow it) that a debugger can load for post‑mortem analysis.
This article is an expanded guide in the C++ error series. It covers:
- What a segfault is at the hardware/OS level (virtual memory, protection, and signal 11).
- Common causes with minimal reproducible C++ and concrete fixes: null dereference, use‑after‑free, buffer overflow, stack overflow, uninitialized pointers, and related UB that often manifests as a crash.
- A practical debugging workflow: core dumps, GDB and LLDB, Valgrind memcheck, and AddressSanitizer.
- Prevention (smart pointers, bounds checking, static analysis) and the distinction between stack and heap failures.
- Multithreaded crashes (races) and how they differ from single‑threaded access violations.
- Platform notes (Linux, macOS, Windows/WSL) and a tool comparison plus CI and production considerations.
- Case studies and a debugging checklist (the kind that only comes from time in the field).
Environments: Examples assume GCC/Clang on Linux or macOS. On Windows, use MSVC with Address Sanitizer or /RTC where applicable, or develop under WSL2 for a Linux‑like toolchain.
Table of contents
- What is a segmentation fault?
- Memory protection and virtual addresses
- SIGSEGV and signal 11
- Common causes (overview)
- Cause: null pointer dereference
- Cause: use‑after‑free and dangling pointers
- Cause: buffer overflow and out‑of‑bounds access
- Cause: stack overflow
- Cause: uninitialized pointers and indeterminate values
- Optional: bad casts and object lifetime
- Stack vs heap segfaults
- Multithreaded segfaults and data races
- Debugging workflow: core dumps
- GDB: backtrace, frame, and inspection
- LLDB: equivalent commands
- Valgrind memcheck
- AddressSanitizer (ASan) setup
- Platform differences: Linux, macOS, Windows
- Tools comparison: GDB vs Valgrind vs ASan
- CI integration with sanitizers
- Production: core collection and process limits
- Prevention: smart pointers and bounds
- What I wish I knew (personal regrets)
- My debugging checklist (evolved over 5 years)
- Case studies: The Midnight Segfault and It Only Crashes on Friday
- Quick reference: symptoms in prose
- Related posts
- Keywords
What is a segmentation fault?
A segfault is not a C++ language exception. It is an OS-level event: the process attempted to read, write, or execute from a virtual address that is not mapped, is mapped with incompatible permissions, or is otherwise disallowed. On POSIX systems this usually surfaces as SIGSEGV; the shell that launched the process sees that it was killed by that signal and prints Segmentation fault.
Typical user‑visible triggers include:
- Dereferencing null or a trash pointer value.
- Accessing freed heap memory (use‑after‑free) or a dangling pointer to stack storage.
- Buffer overruns that corrupt control data or land on a guard page.
- Stack overflow (often infinite or very deep recursion, or enormous stack locals).
Because many of these situations are also undefined behavior in C++, the program might appear to work in one build, then crash in another optimization level, or on a different machine. See also: [undefined behavior](/en/blog/cpp-error-10-undefined-behavior/).
Memory protection and virtual addresses
Modern systems give each process a private virtual address space. The kernel and CPU (via the MMU) translate virtual addresses to physical RAM with page granularity (commonly 4 KiB, sometimes larger with huge pages). Each page has flags such as read, write, and execute. When your instruction tries to access an address, the hardware checks the mapping; if the access violates protection or the address is unmapped, the CPU raises a fault, which the OS converts into a signal for user space (for user mode code) or a kernel panic (for bad kernel code on some paths).
Why this matters for C++: your pointers are just integers interpreted as addresses. The compiler and standard library will not, in general, “stop” a bad access at the language level. Tools like ASan and Valgrind insert instrumentation or use binary translation to detect many invalid patterns before the hardware would fault—often with precise line numbers.
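The page flags described above can be observed from user code. The following is a minimal sketch, assuming a POSIX system (Linux or macOS) where mmap and mprotect are available; the function name is hypothetical. It maps one page, writes to it, removes write permission, and reads the value back. The write that would fault is left commented out.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

// Maps one anonymous page, writes to it, then drops write permission.
// Returns the value read back through the read-only mapping, or -1 on failure.
int read_back_from_read_only_page() {
    const long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) return -1;
    const std::size_t len = static_cast<std::size_t>(page);
    assert(len >= sizeof(int));

    void* mem = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;

    int* slot = static_cast<int*>(mem);
    *slot = 42;                          // allowed: the page is writable

    if (mprotect(mem, len, PROT_READ) != 0) {
        munmap(mem, len);
        return -1;
    }

    const int seen = *slot;              // allowed: the page is still readable
    // *slot = 7;                        // would fault: write to a read-only page

    munmap(mem, len);
    return seen;
}
```

Uncommenting the second write is a quick way to produce a SIGSEGV whose cause is a permission mismatch rather than an unmapped address.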
SIGSEGV and signal 11
On Linux, run:
kill -l | grep -i segv
or consult man 7 signal. You will see that SIGSEGV is signal 11 (implementation details can vary, but 11 is the de facto value on Linux x86_64 for SIGSEGV).
A segfault is delivered when a thread executes an invalid access. A handler may be installed for SIGSEGV, but in general recovering from a bad memory access in arbitrary C++ code is unsafe. Debuggers, fault handlers, and some VM tricks may catch specific cases, but for application code the appropriate response is: fix the bug, not catch and continue.
Related signals: SIGBUS (bus error) can occur for misaligned access, invalid physical mapping, or other platform‑specific bus faults—sometimes confused with SIGSEGV in bug reports. See the FAQ in the frontmatter.
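To see what a parent process such as the shell observes, here is a small sketch assuming a POSIX system. It forks a child that dies from SIGSEGV under the default disposition and reports the terminating signal through waitpid. The function name is hypothetical, and the child disables core files only to keep the demo from littering the disk.

```cpp
#include <cassert>
#include <signal.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks a child that is killed by SIGSEGV and returns the signal number
// the parent observes, or -1 if the child ended some other way.
int signal_that_killed_child() {
    const pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        struct rlimit no_core = {0, 0};
        setrlimit(RLIMIT_CORE, &no_core);   // demo only: skip writing a core file
        signal(SIGSEGV, SIG_DFL);           // make sure no handler is inherited
        raise(SIGSEGV);                     // same signal a bad access triggers
        _exit(0);                           // not reached when the default action runs
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

On Linux the returned value is 11, which is the number the shell translates into its Segmentation fault message.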
Common causes (overview)
Instead of another grid: here is how these failures feel when you are tired and the coffee is cold.
Null dereference is the honest crash: the fault lands at a tiny address, and print some_ptr often shows 0x0. Use‑after‑free and dangling pointers are the opposite of honest—the bad access can be pages away from the delete, and ASan or Valgrind earn their keep by naming a line that is closer to the truth than the allocator’s meltdown.
Buffer overflow sometimes explodes immediately, sometimes corrupts metadata and detonates three function calls later. Stack overflow—and here is the joke for people who have been through it—stack overflow? No, not the website: the thread stack actually runs out, and the backtrace starts to look like a broken record. Uninitialized pointers are the lottery: sometimes a segfault, sometimes silent corruption until a release build on a different CPU.
The sections below go deeper with code and mitigations.
Cause: null pointer dereference
Problem: you load or store through a pointer equal to nullptr (or NULL in C).
Broken:
int* p = nullptr;
*p = 42; // SIGSEGV: write to 0x0
Fix patterns:
- Check before use in defensive code paths.
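- API design: return std::optional<T> or T* with documented nullness (std::optional<T&> is ill-formed in C++17 through C++23; std::optional<std::reference_wrapper<T>> is the usual stand-in); avoid “maybe null, maybe not” without comments.
- Smart defaults: use references where null is not a valid state, or not_null (Guidelines / gsl) in codebases that adopt it.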
Example with check:
void assign(int* p) {
if (p == nullptr) {
// handle: throw, return error, or assert in debug
return;
}
*p = 42;
}
Real‑world note: a null dereference in a hot path may be a logic bug (a broken invariant) rather than a missing if-statement. Prefer fixing why the pointer is null: missing initialization, or a failed allocation (rare for small objects on 64‑bit systems, and plain new throws std::bad_alloc rather than returning null).
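One way to make nullness part of the API is to let lookups return a pointer that is documented as nullable, and let everything downstream take a reference. The sketch below assumes a hypothetical Registry type and hypothetical function names; it is an illustration of the pattern, not a prescribed design.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical registry used only for illustration.
using Registry = std::map<std::string, int>;

// Returns nullptr when the key is absent; callers must check.
int* find_counter(Registry& reg, const std::string& key) {
    auto it = reg.find(key);
    return it == reg.end() ? nullptr : &it->second;
}

// Takes a reference: "no counter" is not a representable state here.
void bump(int& counter) { ++counter; }

// Returns false instead of dereferencing null when the key is missing.
bool bump_if_present(Registry& reg, const std::string& key) {
    if (int* counter = find_counter(reg, key)) {
        bump(*counter);
        return true;
    }
    return false;
}
```

The null check lives in exactly one place, at the boundary where absence is possible, and bump never needs a defensive if.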
Cause: use‑after‑free and dangling pointers
Problem: memory is deallocated, but a raw pointer or reference still aliases it. Any read/write is undefined behavior; often a segfault when the allocator reuses the address or metadata is corrupted.
Broken:
int* p = new int{7};
delete p;
*p = 1; // UAF: undefined behavior, often ASan/segfault
Broken (dangling to stack):
int* make_bad() {
int x = 0;
return &x; // x dies when function returns: dangling
}
void use() {
int* p = make_bad();
*p = 1; // stack UAF: UB, often crash
}
Fixes:
- Prefer std::unique_ptr, std::shared_ptr (when shared ownership is real), and containers (std::vector, std::string) over manual new/delete.
- Narrow pointer lifetime to a scope that encloses all uses; avoid returning pointers to inner stack variables.
- For observer patterns, use non-owning pointers only when the pointee’s lifetime is provably longer; without clear documentation and runtime checks in debug builds, this is often a design smell.
Fixed shape (heap):
#include <memory>
void ok() {
auto p = std::make_unique<int>(7);
*p = 1; // valid until p is destroyed
}
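When an observer genuinely cannot prove that its target outlives it, std::weak_ptr lets it detect the dead target instead of dereferencing a dangling pointer. A minimal sketch, with a hypothetical Observer class, assuming the target is owned by a std::shared_ptr:

```cpp
#include <cassert>
#include <memory>
#include <optional>

// Non-owning observer: holds a weak_ptr, so it can tell when the target died.
class Observer {
public:
    explicit Observer(const std::shared_ptr<int>& target) : target_(target) {}

    // Returns the current value, or std::nullopt once the owner released it.
    std::optional<int> read() const {
        if (std::shared_ptr<int> alive = target_.lock()) {
            return *alive;       // alive keeps the int valid inside this scope
        }
        return std::nullopt;     // target is gone: no dangling dereference
    }

private:
    std::weak_ptr<int> target_;
};
```

The cost is shared ownership of the target, so this fits best where that ownership model already exists.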
Cause: buffer overflow and out‑of‑bounds access
Problem: a write extends past the end of an array or the allocated capacity. This may corrupt heap metadata, adjacent objects, or stack canaries—or hit an unmapped page and fault immediately, depending on layout.
Broken:
void stack_buffer_overrun() {
int a[3] = {1, 2, 3};
a[3] = 4; // out of bounds: undefined behavior
}
Fixes:
- Use std::vector and at() when you want checked access in debug/exception contexts.
- Bounds in loops: prefer range‑based for, algorithms, and std::array::size().
- Turn on ASan in CI for tests.
With std::vector and checked access:
#include <vector>
void safe(std::size_t i) {
std::vector<int> v{1, 2, 3};
if (i < v.size()) {
v[i] = 99; // or v.at(i) to throw on bad i
}
}
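ASan reports stack-buffer-overflow and heap-buffer-overflow with line numbers. Valgrind memcheck reports heap overruns as invalid reads or writes, but it generally misses overruns that stay inside a stack frame or a global array. The default hardware fault may give no line unless you have symbols and a core.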
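Where throwing on a bad index is acceptable, at() turns an overrun into an exception the caller can handle. The helper below is a sketch with a hypothetical name; it converts the exception into a boolean result.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Writes value at index i; returns false instead of writing out of bounds.
bool try_store(std::vector<int>& v, std::size_t i, int value) {
    try {
        v.at(i) = value;          // at() throws std::out_of_range on a bad index
        return true;
    } catch (const std::out_of_range&) {
        return false;             // the caller decides how to report the bad index
    }
}
```

In hot loops an explicit size check is usually cheaper than exception handling; at() is most useful at API boundaries where indices come from outside.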
Cause: stack overflow
Stack overflow? No, not the website—we mean the real kind: the thread’s stack is finite (often a few MB on main thread; smaller on some embedded/thread pools). Deep recursion without a base case, or huge stack allocations (int a[1'000'000]), can exceed the limit.
Broken:
void recurse_forever() {
recurse_forever(); // until stack limit → often SIGSEGV
}
Fixes:
- Add base cases; prove recursion depth.
- Move large arrays to the heap (std::vector, std::unique_ptr<T[]>).
- Convert recursion to iteration where possible.
Shape with heap for large data:
#include <vector>
void work(std::size_t n) {
std::vector<int> buf(n); // allocation on heap, small frame
// ...
}
In GDB, a stack overflow often shows a repeating set of stack frames, or a fault inside low‑level stack check routines—exact behavior is platform and libc dependent.
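Converting recursion to iteration usually means replacing the call stack with an explicit work list on the heap. The sketch below computes the depth of a tree this way. The children-by-index layout and the function name are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// children[i] lists the child indices of node i (a hypothetical tree layout).
// Computes the depth of the tree rooted at root without recursion.
std::size_t tree_depth(const std::vector<std::vector<std::size_t>>& children,
                       std::size_t root) {
    std::size_t deepest = 0;
    // Explicit work list on the heap: (node index, depth of that node).
    std::vector<std::pair<std::size_t, std::size_t>> pending;
    pending.emplace_back(root, std::size_t{1});
    while (!pending.empty()) {
        const std::pair<std::size_t, std::size_t> item = pending.back();
        pending.pop_back();
        if (item.second > deepest) deepest = item.second;
        for (std::size_t child : children[item.first]) {
            pending.emplace_back(child, item.second + 1);
        }
    }
    return deepest;
}
```

A degenerate tree with hundreds of thousands of levels only grows the vector, whereas the recursive version would need one stack frame per level.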
Cause: uninitialized pointers and indeterminate values
Problem: reading a pointer (or, more generally, automatic storage) before it is set yields indeterminate values in many cases. Dereferencing a garbage address may segfault if it maps to a bad page, or may corrupt memory “silently” if it points into valid but wrong storage.
Broken (illustrative; the compiler may warn with -Wuninitialized):
int* p;
// *p = 1; // if uncommented: write through an indeterminate pointer, UB
Fixes:
- Initialize at declaration: T* p = nullptr; or auto p = &known_object;.
- Enable high warning levels: -Wall -Wextra -Wuninitialized, and treat warnings as errors in CI.
- Valgrind can flag uninitialized values used in branches/addresses; MSVC /RTC1 in Debug builds adds stack checks.
Safer pattern:
int x = 0;
int* p = &x; // always valid for x's lifetime
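The same rule applies to class members: default member initializers give every pointer a defined value before any constructor body runs. A minimal sketch with a hypothetical BufferView type:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical buffer descriptor; every member has a defined value from the start.
struct BufferView {
    const char* data = nullptr;   // never indeterminate
    std::size_t size = 0;
};

// Safe on a default-constructed BufferView: null and empty are both checked.
char first_or(const BufferView& view, char fallback) {
    if (view.data == nullptr || view.size == 0) return fallback;
    return view.data[0];
}
```

A default-constructed BufferView is then a valid "empty" state instead of a garbage address waiting to be dereferenced.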
Optional: bad casts and object lifetime
Downcasting in polymorphic hierarchies with static_cast to the wrong dynamic type can produce undefined behavior. Results include corrupt vtable reads and segfaults. Prefer:
- dynamic_cast to pointers or references in polymorphic hierarchies, and check for nullptr (the reference form throws std::bad_cast instead of returning null).
struct Base { virtual ~Base() = default; };
struct Derived : Base { int d = 1; };
void f(Base* b) {
if (auto* d = dynamic_cast<Derived*>(b)) {
(void)d->d;
}
}
Stack vs heap segfaults
Stack lives in function frames, alloca, and oversized automatic arrays; it breaks with deep recursion and “I will just put a megabyte on the stack.” Heap comes from new, malloc, and std::vector’s backing store; it fails with unbounded growth (often OOM before a classic segfault on some systems) or ownership mistakes.
UAF on the stack is the classic “return a pointer to a local”; on the heap it is “delete then touch.” ASan will say stack-buffer-overflow versus heap-buffer-overflow or heap-use-after-free—different words, same family argument. In a core file, stack overflow often shows repeating frames; heap corruption often lands you in the allocator or in a *p long after the real bug.
Heuristic for triage: if ASan points to stack in the first bad access, look at stack depth and VLA/huge local patterns. If it is heap, look at ownership and index logic.
Multithreaded segfaults: races and data races
A data race (two un‑synchronized accesses, at least one write) is undefined behavior in C++. It does not “always” crash—on some runs you get wrong answers; on others a segfault if memory is torn or invariants are broken.
Example pattern (buggy):
#include <thread>
int* g = nullptr;
void writer() {
delete g; // if another thread still reads, UB
g = nullptr;
}
void reader() {
if (g) {
*g = 1; // can race with writer
}
}
Mitigations:
- Mutexes, atomics with clear ordering, and message passing instead of ad hoc shared state.
- ThreadSanitizer (-fsanitize=thread) for data races. It needs its own build: GCC and Clang do not allow -fsanitize=thread together with -fsanitize=address in the same binary, so follow the compiler docs and run it as a separate configuration.
Segfaults that only appear under load (many cores, different timings) are a strong signal for a race or lifetime issue shared across threads.
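A repaired shape of the reader/writer pattern keeps the null check, the dereference, and the release under one mutex. The SharedSlot class below is a hypothetical sketch of that idea; it is not the only fix, and message passing or atomics may suit a given design better.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Shared slot whose lifetime changes are serialized by one mutex.
class SharedSlot {
public:
    void set(int value) {
        std::lock_guard<std::mutex> lock(mu_);
        slot_ = std::make_unique<int>(value);
    }

    // Writer side: releases the int while holding the lock.
    void clear() {
        std::lock_guard<std::mutex> lock(mu_);
        slot_.reset();
    }

    // Reader side: the null check and the store happen under the same lock,
    // so clear() cannot free the int between them.
    bool store_if_present(int value) {
        std::lock_guard<std::mutex> lock(mu_);
        if (!slot_) return false;
        *slot_ = value;
        return true;
    }

private:
    std::mutex mu_;
    std::unique_ptr<int> slot_;
};
```

Calling clear() from one thread and store_if_present() from another is then well defined: the reader either sees a live object or gets false.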
Debugging workflow: core dumps (POSIX)
A core file is a snapshot of the process at crash time. Enablement and paths vary by OS.
ulimit and core file size
ulimit -c unlimited # allow cores (shell session; put in profile with care)
./your_binary
If cores are not produced, check:
- Current limit: ulimit -a
- sysctl/core_pattern on Linux:
cat /proc/sys/kernel/core_pattern
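The soft limit that ulimit -c controls can also be raised from inside the process, which helps when the program is started by a service manager rather than an interactive shell. A minimal sketch, assuming POSIX getrlimit/setrlimit; the function name is hypothetical.

```cpp
#include <cassert>
#include <sys/resource.h>

// Raises the soft core-size limit to the hard limit for this process.
// Returns true if the limit was read and updated.
bool raise_core_limit_to_hard_max() {
    struct rlimit lim;
    if (getrlimit(RLIMIT_CORE, &lim) != 0) return false;
    lim.rlim_cur = lim.rlim_max;   // soft may be raised up to hard without privileges
    return setrlimit(RLIMIT_CORE, &lim) == 0;
}
```

This only lifts the size limit. Whether and where a core is written still depends on core_pattern and on the hard limit set by the environment.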
Analyzing a core in GDB (preview)
gdb -q ./your_binary core
(gdb) bt
(gdb) thread apply all bt
We expand GDB in the next section.
Windows note: “core dumps” are more commonly minidump/full dumps via WER or procdump. Use WinDbg or Visual Studio to analyze; concepts (stack, threads) align with GDB, but the commands differ. WSL offers a near‑Linux workflow.
GDB: backtrace, frame, and inspection
Start GDB on a binary:
g++ -g -O0 main.cpp -o app
gdb ./app # or gdb --args ./app arg1 arg2
After a segfault in GDB (if you run inside GDB):
(gdb) run
...
Program received signal SIGSEGV, ...
(gdb) bt
(gdb) frame 3
(gdb) list
(gdb) print some_ptr
(gdb) info locals
(gdb) info registers
(gdb) x/16i $pc
Commands I actually use (no table, just muscle memory): bt / backtrace for the stack; bt full when I need locals; frame n or f n to sit on the right frame; up / down to walk it; print / p for values; info threads and thread n when I finally admit it is not the main thread; thread apply all bt for the “async” bug that was really thread 7 all along; disassemble and x/… when I need to stop guessing and look at bits.
On a post‑mortem core:
gdb -q ./your_binary core
(gdb) bt
(gdb) info proc mappings # on some GDB builds / OS combinations
If symbols are stripped, you only see addresses—ship debug symbols in split debug files in production if you need actionable cores. Many shops keep separate symbol packages and use debuginfod on Linux to fetch DWARF.
Optimize for debuggability in dev: build with -O0 or -Og when reproducing, because heavy inlining and reordering complicate stepping.
LLDB: equivalent commands
Launch:
lldb ./your_binary
(lldb) run
GDB habits with an LLDB accent: stack traces are still bt or thread backtrace; frames are frame select n; printing is p or frame variable; threads are thread list and thread backtrace all; breakpoints are breakpoint set -n main; running is still run / r.
Core on macOS (names vary):
lldb ./your_binary -c /cores/core.PID
(lldb) thread backtrace all
Path and privacy settings for cores differ by macOS version and SIP. Always test your crash pipeline on a non‑prod machine that mirrors production policies.
Valgrind memcheck (Linux, macOS on supported CPUs)
Valgrind runs the program in a virtualized execution environment, intercepting memory operations. It is slow (often 10x–50x) but does not require recompilation, so the tool shines on code where no sanitizer is available or you need a broad scan.
Typical run:
valgrind --leak-check=full --show-leak-kinds=all ./your_binary
What to expect:
- Invalid read/write of heap, stack, or freed memory.
- Use of uninitialized values (when the branch depends on uninit, etc.).
- Leak reports (leaks are not always segfaults, but they matter next to UAF in triage).
Interpreting output: look for the first error—later errors are often follow‑on corruption. Re-run with a smaller test after fixing the first report.
Limitations: does not support every platform (Apple Silicon has had a patchy story—verify your team’s current Valgrind support); not a substitute for ASan in fast developer loops, but a strong cross‑check in CI nightly jobs if runtime allows.
AddressSanitizer (ASan) setup
ASan is a compiler instrumentation pass. It catches many invalid accesses at the moment they happen, with source lines (when you compile with -g).
Command line (GCC/Clang):
g++ -g -O1 -fsanitize=address -fno-omit-frame-pointer -o app main.cpp
./app
CMake (target):
target_compile_options(myapp PRIVATE -fsanitize=address -g -O1)
target_link_options(myapp PRIVATE -fsanitize=address)
What ASan flags include:
- Heap buffer overflow/underflow, use‑after‑free, double free.
- Stack and global buffer overruns in many cases.
- Some leak detection as LSan (optionally included).
Caveat: the crash path must execute for ASan to report it. A latent bug in an untested branch can still reach production. Combine with test coverage and fuzzing for critical parsers.
Suppressions: in large codebases, you may add ASan suppression files for known third‑party false positives; treat suppressions as debt and document owners.
Example ASan error types (illustrative)
- heap-buffer-overflow
- stack-buffer-overflow
- use-after-poison (allocator patterns)
- heap-use-after-free
Related: for undefined behavior besides memory, consider UBSan (-fsanitize=undefined); it complements ASan. Thread Sanitizer is separate: enable with distinct flags and test plans.
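It is easy to believe a test binary is sanitized when a build flag was silently dropped. The sketch below detects ASan at compile time so a test or startup log can state it. It relies on GCC defining __SANITIZE_ADDRESS__ and Clang supporting __has_feature(address_sanitizer); the BUILT_WITH_ASAN macro and the function names are hypothetical.

```cpp
#include <cassert>

// GCC defines __SANITIZE_ADDRESS__; Clang exposes __has_feature(address_sanitizer).
#if defined(__SANITIZE_ADDRESS__)
#  define BUILT_WITH_ASAN 1
#elif defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define BUILT_WITH_ASAN 1
#  endif
#endif
#ifndef BUILT_WITH_ASAN
#  define BUILT_WITH_ASAN 0
#endif

// True when this translation unit was compiled with AddressSanitizer.
constexpr bool built_with_asan() { return BUILT_WITH_ASAN != 0; }

// Short label suitable for a startup log line or a CI sanity test.
const char* asan_status() {
    return built_with_asan() ? "asan: on" : "asan: off";
}
```

A CI job that is supposed to be sanitized can assert built_with_asan() in one test and fail loudly if the flags were lost.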
Platform differences: Linux, macOS, Windows
Linux is the straight story: fault becomes SIGSEGV, cores follow ulimit and core_pattern, ASan on GCC/Clang is boring in a good way, Valgrind is happiest here, and /proc/self/maps is your friend when scripts need addresses.
macOS speaks the same signal family but you will meet EXC_BAD_ACCESS in lldb; cores land under policies that change across OS versions; Clang ASan tracks Xcode release notes; Valgrind support depends on arch—verify before you bet a release on it; vmmap and sample fill the same mental slots as Linux mapping and profiling.
Windows (native) wraps many faults as access violation 0xC0000005 under SEH; dumps are often WER, procdump, or Visual Studio “Save dump”; MSVC ASan has improved—read the version you actually ship; Valgrind is not the move; VMMap and WinDbg !address are the mapping story.
WSL2 on a Windows host gives you a Linux-shaped toolchain without pretending the native loader is the same as Ubuntu on bare metal—still validate what you ship.
Practical approach: use WSL2 or MSYS2+Clang if you need Linux‑style ASan/Valgrind on a Windows dev machine, while still validating MSVC builds in CI if you ship Windows binaries.
Tools comparison: GDB vs Valgrind vs ASan
GDB/LLDB are for when you already have a corpse: post‑mortem cores, live runs, all threads, registers, and the cold comfort of symbols. They do not invent UB for you—optimized code will still hurt your feelings.
ASan is the daily driver: rebuild, run tests, read a line number, fix the first report, go home earlier. It is not TSan; it will not pat you on the head for data races.
Valgrind is the night shift: no rebuild, broad sweeps on Linux, slow enough to brew tea. On the wrong CPU or OS it simply is not there—plan accordingly.
UBSan catches some non‑memory UB; you have to wire flags like you mean it. TSan is the multithreaded specialist—separate build, real runtime cost. Static analysis before commit catches a different class of mistakes; it also ships false positives you will negotiate with in code review.
A mature workflow uses ASan+UBSan in CI for unit/integration tests, GDB/LLDB for reproducing the one crash a customer sent, and Valgrind when you have a prebuilt binary and time overnight.
CI integration: running with sanitizers
Goals: fail fast in PR pipelines; keep job time bounded; retain repro artifacts (logs, ASan report).
Common recipe (GitHub Actions style sketch):
- name: Configure
  run: |
    cmake -B build -DCMAKE_CXX_FLAGS="-g -O1 -fsanitize=address -fno-omit-frame-pointer" \
      -DCMAKE_EXE_LINKER_FLAGS="-fsanitize=address"
- name: Build
  run: cmake --build build
- name: Test
  # --test-dir needs CMake 3.20 or newer; otherwise cd into build first
  run: ctest --test-dir build --output-on-failure
  env:
    ASAN_OPTIONS: "detect_stack_use_after_return=1:check_initialization_order=1"
Notes:
- OOM in CI: ASan increases memory. Parallel test jobs may need reduced -j or more RAM.
- Flaky tests: races may appear only in TSan jobs, so separate the pipeline stage.
- Cache: sanitized object files should not be mixed with release without clean rebuild.
ASAN_OPTIONS (non‑exhaustive): halt_on_error=0 for continuous discovery (use sparingly; prefer strict failure for determinism in PR builds).
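The options from the recipe can also be compiled into a test binary, so a developer running it locally gets the same checks without exporting anything. The ASan runtime looks for a function named __asan_default_options at startup; the sketch below defines it with the same option string as the CI recipe. To my knowledge, values set in the ASAN_OPTIONS environment variable still take precedence, but verify that against your compiler's sanitizer documentation.

```cpp
#include <cassert>
#include <cstring>

// The ASan runtime calls this hook at startup and parses the returned string
// the same way it parses ASAN_OPTIONS. Without ASan it is just an unused function.
extern "C" const char* __asan_default_options() {
    return "detect_stack_use_after_return=1:check_initialization_order=1";
}
```

Define it in exactly one translation unit of the executable, typically next to the test main.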
Production: core dump collection and policies
Principles:
- You usually do not run ASan in production (overhead, security surface).
- You do want reliable crash telemetry: at minimum, a handler for fatal signals that logs a stack (using only async‑signal‑safe calls) and, if possible, a minidump/core to controlled storage.
- Tighten ulimits and coredump size on servers to avoid filling disks; use journaling of crash id + version + build id.
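A fatal-signal handler along the lines of the second principle can be sketched as follows, assuming a POSIX system. It writes a fixed banner with write(2), which is async-signal-safe, and then lets the default action run so a core can still be produced. The banner text, the build id in it, and the function names are hypothetical. A real handler would also need an alternate signal stack (sigaltstack) to survive stack overflow.

```cpp
#include <cassert>
#include <cstring>
#include <signal.h>
#include <unistd.h>

// Hypothetical banner; a real build would inject its own build id here.
static const char kCrashBanner[] = "fatal: SIGSEGV, build=example-build-id\n";

extern "C" void on_segv(int signo) {
    // write(2) is async-signal-safe; printf and iostreams are not.
    const ssize_t ignored = write(STDERR_FILENO, kCrashBanner, sizeof(kCrashBanner) - 1);
    (void)ignored;
    // SA_RESETHAND already restored the default disposition, so the re-raised
    // signal terminates the process and can still produce a core file.
    raise(signo);
}

// Installs the handler for SIGSEGV; returns true on success.
bool install_crash_handler() {
    struct sigaction sa;
    std::memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESETHAND;    // one shot: default action applies afterwards
    return sigaction(SIGSEGV, &sa, nullptr) == 0;
}
```

The handler deliberately does nothing clever: no allocation, no locks, no attempt to continue.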
Linux checklist:
- core_pattern to a program that packages the core, adds metadata, and uploads to a secure bucket.
- Build IDs in ELF and matching debug packages; consider debuginfod.
- If using containers, kernel.core_pattern in the host may differ from what you expect; test inside the real orchestration (Kubernetes hostPath, init settings).
What to capture alongside cores:
- Git commit hash / version string
- Command line, environment subset (redact secrets)
- Thread and memory limits (ulimit -a)
- A/B deployment metadata
Prevention: smart pointers, bounds checking, and process
Code:
- std::unique_ptr for exclusive ownership, std::shared_ptr only when you truly have shared lifetimes, avoid cycles (weak_ptr when needed).
- std::vector / std::array + algorithms over raw T[].
- std::span (C++20) to pass non-owning views with length when replacing pointer+size pairs.
- Avoid returning references to static locals unless intentional and documented; never return pointers to a function's own stack variables.
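Explicit ownership transfer with std::unique_ptr can be sketched like this. The Job and JobQueue types are hypothetical; the point is that a by-value std::unique_ptr parameter forces the caller to write std::move, and the moved-from pointer is guaranteed to be null afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical job type used only for illustration.
struct Job { int id = 0; };

// Sink that takes ownership: the by-value unique_ptr makes the transfer explicit.
class JobQueue {
public:
    void push(std::unique_ptr<Job> job) {
        assert(job != nullptr);            // debug-build check of the precondition
        jobs_.push_back(std::move(job));
    }
    std::size_t size() const { return jobs_.size(); }

private:
    std::vector<std::unique_ptr<Job>> jobs_;
};
```

After q.push(std::move(job)) the caller's job compares equal to nullptr, so an accidental later use is a null dereference that fails fast rather than a silent use-after-free.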
Build:
- Warnings as errors, -Werror=… incrementally, MSVC /W4 and /permissive-.
- ASan/UBSan in CI, nightly Valgrind for critical paths.
- Code review checklist: ownership, threading, errors from I/O, pointer invariants.
Testing:
- Fuzzing for parsers (libFuzzer, AFL++).
- Stress tests and load tests to shake out races and allocator reuse patterns.
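A libFuzzer target is a single function, LLVMFuzzerTestOneInput, that feeds generated bytes into the code under test. The sketch below fuzzes a hypothetical length-prefixed parser (parse_record and its wire format are invented for this example). With Clang, such a target is typically built with -fsanitize=fuzzer,address so that any over-read is reported by ASan.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical wire format: one length byte followed by that many payload bytes.
std::optional<std::string> parse_record(const std::uint8_t* data, std::size_t size) {
    if (size == 0) return std::nullopt;
    const std::size_t len = data[0];
    if (len > size - 1) return std::nullopt;   // truncated input: refuse, do not over-read
    return std::string(reinterpret_cast<const char*>(data + 1), len);
}

// Entry point that libFuzzer calls repeatedly with generated inputs.
extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t* data, std::size_t size) {
    (void)parse_record(data, size);   // an out-of-bounds read here would trip ASan
    return 0;                         // libFuzzer expects 0 from the target
}
```

Removing the length check in parse_record is a good exercise: the fuzzer should find a heap-buffer-overflow within seconds.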
What I wish I knew (personal regrets)
- I wish I had defaulted to std::unique_ptr and explicit ownership transfer years earlier; every manual delete I hand-wrote was a debt I paid with someone else’s weekend.
- I wish I had stopped using nullable bare pointers for “required” arguments; references (or a not_null policy), with std::optional reserved for values that can really be absent, would have deleted whole classes of defensive if noise.
- I wish I had treated bounds as part of the API, not as a comment: vector::at and real size() checks in hot loops turn “mysterious allocator crash” into a failing test.
- I wish I had taken multithreaded crashes at face value the first time: mutex the shared soup, or delete the shared soup; “rare” was never rare, only under-sampled.
- I wish I had kept ASan and -g in every dev/CI build I touched; the time I “saved” skipping sanitizer came back as -O0 gdb sessions at the wrong hour.
- I wish I had separated production reality from lab dreams: ship without ASan if you must, but never ship without a path from core + build id + symbols back to a line.
- I wish I had re-run UBSan/ASan on every third-party bump; the library did not change, until the one week it did, and the segfault had my name on the call graph.
My debugging checklist (evolved over 5 years)
This replaced a tidy flowchart in my notes, because real crashes do not read flowcharts—they arrive sideways.
- Repro or go home: same binary, same inputs, same flags. If I cannot get it, I reach for ASan and -O0 before I reach for theory.
- First error wins: especially ASan’s first report. I stopped chasing phantom “root causes” after the fourth downstream explosion.
- Nothing from ASan? The bad path might not run, or the bug is plain UB that needs UBSan, logging, a minimized test, or a fuzzer, not hope.
- Multithreaded: I run thread apply all bt on core files before I touch source; I run TSan on a real workload, not a unit test with one thread napping.
- Crashes only under load: I look for races, shutdown order, and queue lifetimes; I do not “fix” a line until I know which thread owned the lie.
- Third party in the trace: I build a minimal repro with only that library, verify struct layout and lifetime of every buffer I pass, and only then open an upstream issue (often the bug is my “correct” call).
- Regression: I add a test that would have failed yesterday; I add sanitizer to CI until the team complains, then I keep it.
Narrative that survived contact with production: the first actionable signal still wins. In multithreaded code, the all‑threads backtrace in GDB/LLDB is not optional—without it, I have fixed the wrong line with confidence.
Case studies: The Midnight Segfault and It Only Crashes on Friday
Timeline: when adding ASan made it “worse”
Day 1: I added AddressSanitizer and -fno-omit-frame-pointer, rebuilt, and ran the test suite. The build turned red in places that had been “green enough” for months—which felt like a defeat until I realized ASan was doing its job, not spiting me.
Day 2: Worse. The crash count went up: stack-buffer, heap UAF, use-after-poison—different lines, same guilty subsystem. I spent the day wondering if I had made the program more broken. I had not. I had made the lies actionable instead of silent.
Day 3: Found it. I fixed the earliest ASan report in the pipeline, re-ran, watched half the “new” errors evaporate, and the original intermittent failure shrank to a one-line delete paired with a pointer that escaped a callback. I should have started at day one. I am telling you this so you can.
Case 1: The Midnight Segfault
The scene: production build, traffic pattern that only shows up when half the world is asleep, and a core file that does not point at the line you wish it did. Debug builds? Fine. Release? A slow-motion crash after hours.
What I believe now about this class: the CPU is not mean—undefined behavior and uninitialized reads do not sign their work. The optimizer removes branches that were never legal; signed overflow, misaligned reads, and “impossible” states become possible at -O2.
What I do: enable UBSan on a representative run; use git bisect on the week the regression landed; disassemble the suspect function if the source lies; re-read [undefined behavior](/en/blog/cpp-error-10-undefined-behavior/) until humility returns.
When the allocator eats the backtrace — a heap case folded into the same long night: crash deep in free(), “random” stack, panic in the coffee machine. I stop staring at the allocator. I get ASan/Valgrind to show the first invalid write; if those are off, I try weaker guardrails (MALLOC_CHECK_ on Linux) knowing they are a bandage. I fix the earliest corruption, not the loudest frame.
Case 2: It Only Crashes on Friday
The scene: the binary is innocent; the calendar is the input. A weekly batch, a report job, a traffic spike, or a deploy window—Friday in my case meant “enough load and enough cores that timing finally disagrees with my mutex story.”
What it looked like: low frequency, different stacks, more cores → more fun. The debugger said thread A; last week it said thread C. I wanted a deterministic villain; I got a data race in a hand-written queue and a shutdown sequence that freed work items while a pool still had pointers to them.
What I do: thread apply all bt on every core; run TSan on a workload that looks like the bad day, not like my laptop; audit join order and queue lifetime; stop pretending “only under load” means “rare in nature.”
The third-party footnote I still resent: the top of the stack is libfoo.so, the ticket says “not our code,” and the real bug is my buffer that outlives the call because I misread a paragraph in the documentation. I isolate a minimal repro with the library alone, I verify struct layout, and I ASan-link the test binary the way the DSO was meant to be built.
Quick reference: symptoms in prose (no last-minute table)
Null: you print p in GDB and the universe says 0x0. Fix the invariant, use std::optional, or a not_null policy if your codebase uses one.
Dangling / UAF: ASan or Valgrind points at a line you did not expect; a crash just after a callee returns. Prefer unique_ptr, never return addresses into dead stack.
Stack (the hardware kind again): a bt that stutters, or a frame the size of a small planet. Stack overflow? No, not the website—iterate, put big data on the heap, and raise stack limits only when the design truly needs it.
Overrun: ASan shouts *-buffer-overflow. Teach loops about size(), use at() when “throw on bad” is allowed, and pass lengths with std::span when you are honest about APIs.
Uninit: Valgrind’s “Conditional jump or move depends on uninitialised value(s)” is a mood. Initialize at declaration, crank warnings, treat -Wuninitialized as a gift.
Races: flaky tests, Friday-only job pain, more cores → more wrong answers. TSan, mutex discipline, and a shutdown review beat heroic single-threaded thinking.
Bad cast in a polymorphic hierarchy: use dynamic_cast and check for nullptr, or a variant and std::visit if that matches your design.
OS-specific cores: the binary is the same but core_pattern and container settings are not—reproduce in the same class of host you ship.
Related posts (internal)
- GDB/LLDB guide
- Sanitizers
- LNK2019
- [Segmentation fault checklist](/en/blog/cpp-error-27-segmentation-fault/)
- Segfault debugging walkthrough
Keywords
Segmentation fault, segfault, SIGSEGV, signal 11, core dump, GDB, LLDB, Valgrind, AddressSanitizer, use-after-free, buffer overflow, stack overflow, data race
Closing: combine core dumps and symbolicated stacks for production post‑mortems, and use ASan/UBSan (and where appropriate, TSan/Valgrind) during development and CI to catch invalid memory and many varieties of undefined behavior before they ship. When a segfault is hard to reproduce, shrink the test, capture all threads, and always fix the earliest tool‑reported fault.
Extended FAQ (inline)
Q. Why does the fault address not match my variable?
A. Optimized code, register promotion, and inlined functions change what you see in the debugger. Use -O0 for stepping; trust ASan’s line over guesswork when available.
Q. Can I catch SIGSEGV and continue?
A. Generally no for application logic—by then memory may be corrupt. For specialized runtimes, consult platform‑specific async‑signal‑safe rules; do not do heavy work in a signal handler.
Q. Are integer overflows the same as segfaults?
A. Not in general: signed overflow is UB but often causes no immediate fault. UBSan finds many of these. Some overflows lead to bad sizes in allocations and then a fault later (indirectly).
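The indirect path usually runs through a size computation. Unsigned wraparound is well defined, but a wrapped count * element size still yields an allocation that is far too small for the writes that follow. A minimal sketch of a checked multiplication, with a hypothetical function name:

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <optional>

// Returns count * elem_size, or std::nullopt if the product would wrap around.
std::optional<std::size_t> checked_alloc_size(std::size_t count, std::size_t elem_size) {
    if (elem_size != 0 &&
        count > std::numeric_limits<std::size_t>::max() / elem_size) {
        return std::nullopt;   // refuse: the wrapped size would be too small
    }
    return count * elem_size;
}
```

Containers such as std::vector perform this kind of check for you; the helper matters mostly around raw malloc-style interfaces and sizes parsed from untrusted input.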
Q. What about nullptr in delete?
A. delete of a null pointer is a no‑op (safe), for both delete and delete[]; the danger is delete on memory that was never allocated with new or was already freed, which is UB and often a crash.
See also (related internal links)
- [C++ Undefined Behavior : Why Release-Only Crashes Happen and](/en/blog/cpp-error-10-undefined-behavior/)
- [C++ Segmentation Fault: Causes, Debugging, and Prevention](/en/blog/cpp-error-27-segmentation-fault/)
- C++ Segmentation fault | core dump
- C++ GDB/LLDB | The bug that 100 cout statements could not find, solved in 5 minutes with a debugger
- Rust vs C++ memory safety | Differences in compiler errors [#47-3]