Custom C++ Memory Pools: Fixed Blocks, TLS, and Benchmarks [#48-3]
Key takeaways
Design fixed-block pools with intrusive free lists, compare them to malloc, add thread-local pools for lock-free hot paths, and avoid double-free and use-after-free at pool boundaries.
Introduction: “When new/delete show up in the profiler”
Why memory pools?
Many same-sized allocations and frees amplify heap fragmentation and allocator cost. This article implements fixed-block pools, thread-local pools, object pools, frame allocators, and stack allocators, and compares against ::operator new with benchmarks.
See also: Network handler allocators, PMR allocators.
Scenarios
| Problem | Pool angle |
|---|---|
| malloc hot in profiler | Fixed blocks + reuse |
| Long-run OOM despite free RAM | Reduce global heap fragmentation |
| Many threads, poor scaling | Thread-local pools avoid heap lock contention |
| Cache misses on pointer chasing | Contiguous pool storage |
Design: fixed blocks
- Allocate a chunk (e.g. with ::operator new).
- Split it into fixed-size blocks; the first bytes of each free block store the next pointer (intrusive free list).
- allocate: pop the head; deallocate: push the head (LIFO). Expand with a new chunk when the free list is empty.
- Alignment: round block and chunk sizes up to alignof(std::max_align_t).
- Fallback: for requests larger than the block size, delegate to global new.
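The steps above can be sketched as a small class. This is a minimal single-threaded sketch, not a production allocator; the names (FixedBlockPool, blocks_per_chunk) are illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Sketch of a fixed-block pool with an intrusive free list.
class FixedBlockPool {
public:
    explicit FixedBlockPool(std::size_t block_size, std::size_t blocks_per_chunk = 256)
        : block_size_(round_up(block_size)), blocks_per_chunk_(blocks_per_chunk) {}

    ~FixedBlockPool() {
        for (void* chunk : chunks_) ::operator delete(chunk);
    }

    void* allocate(std::size_t n) {
        if (n > block_size_) return ::operator new(n);  // fallback for oversized requests
        if (!free_head_) expand();
        Node* node = free_head_;
        free_head_ = node->next;                        // pop head (LIFO)
        return node;
    }

    void deallocate(void* p, std::size_t n) {
        if (n > block_size_) { ::operator delete(p); return; }
        auto* node = static_cast<Node*>(p);
        node->next = free_head_;                        // push head (LIFO)
        free_head_ = node;
    }

private:
    struct Node { Node* next; };

    static std::size_t round_up(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        if (n < sizeof(Node)) n = sizeof(Node);         // block must fit the free-list link
        return (n + a - 1) / a * a;                     // round up to max alignment
    }

    void expand() {
        void* chunk = ::operator new(block_size_ * blocks_per_chunk_);
        chunks_.push_back(chunk);
        char* base = static_cast<char*>(chunk);
        for (std::size_t i = 0; i < blocks_per_chunk_; ++i) {
            auto* node = reinterpret_cast<Node*>(base + i * block_size_);
            node->next = free_head_;
            free_head_ = node;
        }
    }

    std::size_t block_size_;
    std::size_t blocks_per_chunk_;
    Node* free_head_ = nullptr;
    std::vector<void*> chunks_;
};
```

Because frees push onto the head, the most recently freed block is handed out first, which tends to keep hot blocks in cache.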
Thread-local pool
A thread_local FixedBlockPool gives lock-free alloc/free on its owning thread, which suits per-thread workers.
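A self-contained sketch of the idea, using a trivial fixed-buffer pool (TinyPool is illustrative) so the thread_local mechanics stand alone. Each thread gets its own instance, so the hot path touches no shared state and needs no lock.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>

// Illustrative per-thread pool: a free list over a fixed buffer.
struct TinyPool {
    struct Node { Node* next; };
    static constexpr std::size_t kBlock = 64, kCount = 1024;
    alignas(std::max_align_t) char buf[kBlock * kCount];
    Node* head = nullptr;

    TinyPool() {
        for (std::size_t i = 0; i < kCount; ++i) {       // thread free list through buffer
            auto* n = reinterpret_cast<Node*>(buf + i * kBlock);
            n->next = head;
            head = n;
        }
    }
    // No locks: each thread only ever touches its own pool.
    void* alloc() { Node* n = head; head = n->next; return n; }
    void free(void* p) { auto* n = static_cast<Node*>(p); n->next = head; head = n; }
};

TinyPool& local_pool() {
    thread_local TinyPool pool;  // one instance per thread, constructed on first use
    return pool;
}
```

The caveat: a block allocated on one thread must be freed on the same thread, or you need a cross-thread return path (e.g. a mutex-protected remote-free list).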
Game-oriented patterns
| Pattern | Lifetime | Typical use |
|---|---|---|
| Object pool | Per object | Bullets, particles |
| Frame allocator | Per frame | Temp per-frame data |
| Stack allocator | LIFO scope | Scratch buffers |
Placement new on pooled storage + explicit destructor on release.
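A minimal sketch of that pattern, assuming the pool hands out raw aligned blocks; Bullet, spawn, and despawn are illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Illustrative pooled object type.
struct Bullet {
    float x, y, vx, vy;
    Bullet(float x_, float y_) : x(x_), y(y_), vx(0), vy(0) {}
};

// Stand-in for one pool block: raw storage aligned for Bullet.
alignas(Bullet) unsigned char storage[sizeof(Bullet)];

Bullet* spawn(void* block, float x, float y) {
    return new (block) Bullet(x, y);  // placement new: construct in place, no heap allocation
}

void despawn(Bullet* b) {
    b->~Bullet();  // explicit destructor call; the raw block then returns to the pool
}
```

The pool never runs constructors or destructors itself; the object-pool layer does, which is why forgetting the explicit destructor call leaks resources held by the object.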
Benchmarking
Compare N allocate+free cycles: batch vs interleaved; scale thread count with TLS pools. Numbers are machine-specific—always re-measure on your target.
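A minimal harness along those lines, comparing interleaved allocate+free cycles through ::operator new against a tiny free-list pool. Pool and bench are illustrative stubs, not a rigorous benchmark (no warm-up, no statistics).

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <new>
#include <utility>
#include <vector>

// Tiny free-list pool used only as the benchmark subject.
struct Pool {
    struct Node { Node* next; };
    static constexpr std::size_t kBlock = 64;
    Node* head = nullptr;
    std::vector<void*> chunks;

    ~Pool() { for (void* c : chunks) ::operator delete(c); }
    void* alloc() {
        if (!head) {  // grow by one chunk of 1024 blocks
            char* c = static_cast<char*>(::operator new(kBlock * 1024));
            chunks.push_back(c);
            for (int i = 0; i < 1024; ++i) {
                auto* n = reinterpret_cast<Node*>(c + i * kBlock);
                n->next = head;
                head = n;
            }
        }
        Node* n = head;
        head = n->next;
        return n;
    }
    void free(void* p) { auto* n = static_cast<Node*>(p); n->next = head; head = n; }
};

// Returns {heap_ns, pool_ns} for N interleaved allocate+free cycles each.
std::pair<long long, long long> bench(int N) {
    using clock = std::chrono::steady_clock;
    auto t0 = clock::now();
    for (int i = 0; i < N; ++i) { void* p = ::operator new(64); ::operator delete(p); }
    auto t1 = clock::now();
    Pool pool;
    for (int i = 0; i < N; ++i) { void* p = pool.alloc(); pool.free(p); }
    auto t2 = clock::now();
    auto ns = [](auto d) { return std::chrono::duration_cast<std::chrono::nanoseconds>(d).count(); };
    return { ns(t1 - t0), ns(t2 - t1) };
}
```

Run the batch variant too (allocate N, then free N): the interleaved loop above flatters the pool's LIFO reuse, while batch mode stresses chunk growth.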
Common mistakes
- Pool destroyed before objects → UAF; pool must outlive users.
- Wrong pool on deallocate → UB.
- Object larger than block → overflow; use fallback or larger class.
- Double free → corrupt free list.
- Misaligned blocks → UB on some types—pad/align.
- Global pool + multi-thread without TLS/mutex → data race.
- Frame allocator reset, then using last frame's pointers → UAF.
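The frame-allocator hazard in the last bullet is easy to demonstrate with a bump-pointer sketch (FrameAllocator is illustrative; no overflow check for brevity): after reset(), the next allocation reuses the same addresses, so any pointer kept across the reset silently aliases new data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal bump-pointer frame allocator sketch.
struct FrameAllocator {
    std::vector<char> buf;
    std::size_t offset = 0;

    explicit FrameAllocator(std::size_t bytes) : buf(bytes) {}

    void* alloc(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        offset = (offset + a - 1) / a * a;  // align the bump pointer
        void* p = buf.data() + offset;
        offset += n;                        // NOTE: sketch omits out-of-space check
        return p;
    }

    // Invalidates every pointer handed out this frame.
    void reset() { offset = 0; }
};
```

This is why frame-allocated data must never outlive the frame; promote anything longer-lived to an object pool or the heap.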
Production patterns
- Size-class pools (64/128/256…) for varied small sizes.
- Monitored pool counters (peak usage, allocations).
- RAII wrappers with pool deleters.
- Asio handlers allocated from pools when sizes are bounded—see network guide.
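The RAII-wrapper pattern above can be sketched with std::unique_ptr and a custom deleter. StubPool, PoolDeleter, and make_pooled are illustrative; the deleter runs the destructor and returns the block, so ownership at the pool boundary is explicit.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <utility>

// Stand-in pool that just counts live blocks (a real one would recycle them).
struct StubPool {
    int live = 0;
    void* allocate(std::size_t n) { ++live; return ::operator new(n); }
    void deallocate(void* p) { --live; ::operator delete(p); }
};

// Deleter that destroys the object and returns its block to the pool.
template <class T>
struct PoolDeleter {
    StubPool* pool;
    void operator()(T* p) const {
        p->~T();                 // explicit destructor
        pool->deallocate(p);     // block goes back to the pool
    }
};

template <class T>
using pooled_ptr = std::unique_ptr<T, PoolDeleter<T>>;

// Construct T in pooled storage and wrap it; sketch omits exception safety
// between allocate() and the placement-new.
template <class T, class... Args>
pooled_ptr<T> make_pooled(StubPool& pool, Args&&... args) {
    void* mem = pool.allocate(sizeof(T));
    T* obj = new (mem) T(std::forward<Args>(args)...);  // placement new
    return pooled_ptr<T>(obj, PoolDeleter<T>{&pool});
}
```

The wrapper makes the "wrong pool on deallocate" and "pool must outlive users" mistakes structurally harder: the deleter carries the owning pool, and scope ends release deterministically.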
Related posts
- Memory pool guide
- PMR
- Custom allocator article
FAQ
Q. Block size?
A. At least max(sizeof(T), sizeof(void*)); profile real allocation sizes.
Q. Next read?
A. Segfault debugging #49-1
Summary
Fixed pools cut allocator overhead and fragmentation for uniform hot allocations; TLS avoids locks; validate with benchmarks and watch lifetime and ownership at pool boundaries.
Previous: HTTP framework #48-2