C++ Benchmarking: chrono, Warmup, Statistics, and Google Benchmark
Key takeaway
Practical C++ benchmarking: timing, statistics, and Google Benchmark.
What is benchmarking?
Benchmarking is the systematic measurement of how long code takes to run, built from stopwatch-style timing patterns and chrono time conversion. Running the same benchmark before and after an optimization gives numeric proof of the improvement.
Benchmark workflow
graph LR
A[Write Code] --> B[Warmup]
B --> C[Start Measure]
C --> D[Repeat Run]
D --> E[End Measure]
E --> F[Calc Stats]
F --> G{Goal Met?}
G -->|No| H[Optimize]
H --> A
G -->|Yes| I[Done]
#include <chrono>
#include <iostream>

auto start = std::chrono::high_resolution_clock::now();
// code under test
auto end = std::chrono::high_resolution_clock::now();
auto duration = std::chrono::duration_cast<std::chrono::microseconds>(end - start);
std::cout << "Time: " << duration.count() << "μs" << std::endl;
Basic measurement
#include <chrono>
#include <iostream>
#include <vector>

template<typename Func>
auto benchmark(Func f, int iterations = 1000) {
    using namespace std::chrono;
    auto start = high_resolution_clock::now();
    for (int i = 0; i < iterations; ++i) {
        f();
    }
    auto end = high_resolution_clock::now();
    auto total = duration_cast<microseconds>(end - start);
    return static_cast<double>(total.count()) / iterations;  // average per call
}
int main() {
    auto avgTime = benchmark([] {
        std::vector<int> v(1000);  // NOTE: the compiler may elide unused work; see the pitfalls below
    });
    std::cout << "Average: " << avgTime << "μs" << std::endl;
}
Examples
Example 1: Comparing sort algorithms
#include <algorithm>
#include <chrono>
#include <cstdlib>   // std::rand
#include <iostream>
#include <vector>
void compareSort() {
    std::vector<int> data(100000);
    std::generate(data.begin(), data.end(), std::rand);

    auto data1 = data;
    auto start1 = std::chrono::high_resolution_clock::now();
    std::sort(data1.begin(), data1.end());
    auto end1 = std::chrono::high_resolution_clock::now();
    auto time1 = std::chrono::duration_cast<std::chrono::milliseconds>(end1 - start1);

    auto data2 = data;
    auto start2 = std::chrono::high_resolution_clock::now();
    std::stable_sort(data2.begin(), data2.end());
    auto end2 = std::chrono::high_resolution_clock::now();
    auto time2 = std::chrono::duration_cast<std::chrono::milliseconds>(end2 - start2);

    std::cout << "sort: " << time1.count() << "ms" << std::endl;
    std::cout << "stable_sort: " << time2.count() << "ms" << std::endl;
}
Sort performance (100,000 elements, illustrative):
| Algorithm | Typical time | Worst case | Stable | Extra memory |
|---|---|---|---|---|
| std::sort | ~8ms | O(N log N) | No | O(log N) |
| std::stable_sort | ~12ms | O(N log² N) | Yes | O(N) |
| std::partial_sort | ~5ms (top 10%) | O(N log K) | No | O(1) |
| std::nth_element | ~2ms (median) | O(N) | No | O(1) |
Example 2: Statistics
#include <algorithm>
#include <cmath>     // std::sqrt
#include <iostream>
#include <numeric>
#include <vector>
class BenchmarkStats {
    std::vector<double> samples;
public:
    void addSample(double microseconds) {
        samples.push_back(microseconds);
    }
    void printStats() const {
        auto sum = std::accumulate(samples.begin(), samples.end(), 0.0);
        auto avg = sum / samples.size();
        auto sorted = samples;
        std::sort(sorted.begin(), sorted.end());
        auto median = sorted[sorted.size() / 2];
        auto min = *std::min_element(samples.begin(), samples.end());
        auto max = *std::max_element(samples.begin(), samples.end());
        double variance = 0.0;
        for (double s : samples) {
            variance += (s - avg) * (s - avg);
        }
        double stddev = std::sqrt(variance / samples.size());  // population stddev
        std::cout << "Mean: " << avg << "μs" << std::endl;
        std::cout << "Median: " << median << "μs" << std::endl;
        std::cout << "Min: " << min << "μs" << std::endl;
        std::cout << "Max: " << max << "μs" << std::endl;
        std::cout << "StdDev: " << stddev << "μs" << std::endl;
    }
};
What the stats mean:
| Metric | Meaning | Use |
|---|---|---|
| Mean | Average time | Typical performance |
| Median | Middle value | Robust to outliers |
| Min | Best run | Best-case conditions |
| Max | Worst run | Tail latency |
| StdDev | Spread | Stability |
| P95 / P99 | Slow tail | SLA-style targets |
Example 3: Google Benchmark
#include <benchmark/benchmark.h>
#include <vector>
static void BM_VectorPushBack(benchmark::State& state) {
    for (auto _ : state) {
        std::vector<int> v;
        for (int i = 0; i < state.range(0); ++i) {
            v.push_back(i);
        }
        benchmark::DoNotOptimize(v.data());  // keep the vector's work observable
    }
}
BENCHMARK(BM_VectorPushBack)->Range(8, 8 << 10);

static void BM_VectorReserve(benchmark::State& state) {
    for (auto _ : state) {
        std::vector<int> v;
        v.reserve(state.range(0));
        for (int i = 0; i < state.range(0); ++i) {
            v.push_back(i);
        }
        benchmark::DoNotOptimize(v.data());
    }
}
BENCHMARK(BM_VectorReserve)->Range(8, 8 << 10);
BENCHMARK_MAIN();
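Beyond Range arguments, the library can fit an asymptotic complexity curve to the measured times. A sketch of a separate benchmark file using the documented SetComplexityN/Complexity API (benchmark name and size range chosen arbitrarily):

```cpp
#include <benchmark/benchmark.h>
#include <algorithm>
#include <cstdlib>
#include <vector>

static void BM_Sort(benchmark::State& state) {
    std::vector<int> v(state.range(0));
    for (auto _ : state) {
        state.PauseTiming();    // exclude data generation from the measurement
        std::generate(v.begin(), v.end(), std::rand);
        state.ResumeTiming();
        std::sort(v.begin(), v.end());
        benchmark::DoNotOptimize(v.data());
    }
    state.SetComplexityN(state.range(0));  // N used for the complexity fit
}
BENCHMARK(BM_Sort)->RangeMultiplier(4)->Range(1 << 10, 1 << 16)->Complexity();
BENCHMARK_MAIN();
```

The report then includes a fitted big-O term (e.g. NlgN) alongside the per-size timings.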
Example 4: Warmup
#include <chrono>
// Uses the BenchmarkStats class from Example 2.
template<typename Func>
auto benchmarkWithWarmup(Func f, int warmup, int iterations) {
    for (int i = 0; i < warmup; ++i) {
        f();  // warmup runs are not recorded
    }
    BenchmarkStats stats;
    for (int i = 0; i < iterations; ++i) {
        auto start = std::chrono::high_resolution_clock::now();
        f();
        auto end = std::chrono::high_resolution_clock::now();
        auto duration = std::chrono::duration_cast<std::chrono::microseconds>(
            end - start);
        stats.addSample(duration.count());
    }
    return stats;
}
Benchmarking tips
Checklist for accurate measurement
| Item | Recommendation | Why |
|---|---|---|
| Warmup | 10–100 iterations | Stabilize cache and branch prediction |
| Repetitions | 100–1000 | Statistical significance |
| Anti-DCE | volatile or DoNotOptimize | Prevent optimizing away the work |
| Isolation | Close noisy processes | Less noise |
| CPU affinity | taskset on Linux | Fewer core migrations |
| Release build | -O3 -DNDEBUG | Match production performance |
// Warmup: run the code a few times before measuring.
for (int i = 0; i < 10; ++i) {
    f();
}
// Repetitions: collect many samples for statistics.
for (int i = 0; i < 100; ++i) {
    benchmark(f);
}
// Anti-DCE: volatile forces the result to be materialized.
volatile int result = compute();
Common pitfalls
Pitfall 1: Compiler eliminates “useless” work
// If the result is never used, the compiler may delete compute() entirely.
// Fix: sink the result inside the measured lambda.
int result;
auto time = benchmark([&]() {
    result = compute();
    benchmark::DoNotOptimize(result);  // or assign to a volatile variable
});
Pitfall 2: Cache effects
// Warm the caches and branch predictors first, then measure.
for (int i = 0; i < 10; ++i) f();
auto time = benchmark(f);
Pitfall 3: Variance
BenchmarkStats stats;
for (int i = 0; i < 100; ++i) {
    stats.addSample(benchmark(f));
}
stats.printStats();
Pitfall 4: Timer dominates tiny work
Repeat many times and average.
Google Benchmark setup
git clone https://github.com/google/benchmark.git
cd benchmark
cmake -E make_directory "build"
cmake -E chdir "build" cmake -DBENCHMARK_DOWNLOAD_DEPENDENCIES=on -DCMAKE_BUILD_TYPE=Release ../
cmake --build "build" --config Release
# Install system-wide so -lbenchmark resolves (may require sudo):
sudo cmake --build "build" --config Release --target install
g++ -std=c++17 bench.cpp -lbenchmark -lpthread -o bench
./bench
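Useful command-line flags once the binary builds (flag names from the Google Benchmark documentation):

```shell
./bench --benchmark_repetitions=10            # repeat each benchmark, report mean/median/stddev
./bench --benchmark_filter=BM_VectorReserve   # run only benchmarks matching the regex
./bench --benchmark_format=json               # machine-readable output
```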
FAQ
Q1: What is benchmarking?
A: Measuring performance of code paths.
Q2: Why warm up?
A: It reduces cold-cache and cold-branch-predictor bias in the first measurements.
Q3: Which statistics?
A: Mean, median, standard deviation at minimum.
Q4: Tools?
A: Google Benchmark, perf, VTune.
Q5: Preventing optimization?
A: volatile, benchmark::DoNotOptimize, or careful harness design.
Q6: Resources?
A: Optimized C++, Google Benchmark docs, cppreference.com.
See also (internal links)
- C++ stopwatch and benchmarks
- C++ duration
- C++ time conversion
- C++ performance optimization
Practical tips
Debugging
- Fix compiler warnings first
- Reproduce with a minimal test
Performance
- Profile before micro-optimizing
- Define measurable targets
Code review
- Follow team conventions
Checklist
Before coding
- Right technique for the problem?
- Maintainable by the team?
- Meets performance requirements?
While coding
- Warnings cleared?
- Edge cases covered?
- Error handling appropriate?
At review
- Intent clear?
- Tests sufficient?
- Documentation adequate?
Keywords
C++, benchmarking, performance, testing, Google Benchmark, chrono.
Related posts
- C++ algorithm sort
- C++ cache optimization
- C++ CMake
- C++ code coverage
- C++ Conan