Custom C++ Memory Pools: Fixed Blocks, TLS, and Benchmarks [#48-3]

Key takeaways

Design fixed-block pools with intrusive free lists, compare them against malloc in benchmarks, add thread-local pools for lock-free hot paths, and avoid double free and use-after-pool-destruction bugs.

Introduction: “When new/delete show up in the profiler”

Why memory pools?

Many same-sized allocations and frees amplify heap fragmentation and allocator cost. This article implements fixed-block pools, thread-local pools, object pools, frame allocators, and stack allocators, and compares against ::operator new with benchmarks.

See also: Network handler allocators, PMR allocators.


Scenarios

Problem | Pool angle
malloc hot in profiler | Fixed blocks + reuse
Long-run OOM despite free RAM | Reduce global heap fragmentation
Many threads, poor scaling | Thread-local pools avoid heap lock contention
Cache misses on pointer chasing | Contiguous pool storage

Design: fixed blocks

  • Allocate a chunk (e.g. with ::operator new).
  • Split into fixed-size blocks; first bytes of free blocks store next pointers (intrusive free list).
  • allocate: pop head; deallocate: push head (LIFO).
  • Expand with new chunks when the free list is empty.
  • Alignment: round block and chunk sizes to alignof(std::max_align_t).
  • Fallback: for requests larger than block size, delegate to global new.

Thread-local pool

A thread_local FixedBlockPool gives lock-free allocate/deallocate on its owning thread, a good fit for per-thread workers. The trade-off: each block must be returned on the thread that allocated it.


Game-oriented patterns

Pattern | Lifetime | Typical use
Object pool | Per object | Bullets, particles
Frame allocator | Per frame | Temp per-frame data
Stack allocator | LIFO scope | Scratch buffers

Construct with placement new on pooled storage; call the destructor explicitly on release.


Benchmarking

Compare N allocate+free cycles: batch vs interleaved; scale thread count with TLS pools. Numbers are machine-specific—always re-measure on your target.


Common mistakes

  1. Pool destroyed before objects → UAF; pool must outlive users.
  2. Wrong pool on deallocate → UB.
  3. Object larger than block → overflow; use fallback or larger class.
  4. Double free → corrupt free list.
  5. Misaligned blocks → UB on some types—pad/align.
  6. Global pool + multi-thread without TLS/mutex → data race.
  7. Frame allocator reset then use old pointers → UAF.

Production patterns

  • Size-class pools (64/128/256…) for varied small sizes.
  • Monitored pool counters (peak usage, allocations).
  • RAII wrappers with pool deleters.
  • Asio handlers allocated from pools when sizes are bounded—see network guide.

See also: Memory pool guide, PMR, Custom allocator article.

FAQ

Q. Block size?
A. At least max(sizeof(T), sizeof(void*)), so a free block can hold the next pointer; round up to the required alignment, and profile real allocation sizes.

Q. Next read?
A. Segfault debugging #49-1


Summary

Fixed pools cut allocator overhead and fragmentation for uniform hot allocations; TLS avoids locks; validate with benchmarks and watch lifetime and ownership at pool boundaries.

Previous: HTTP framework #48-2