C++ Benchmarking | A Guide to Benchmarking

Key Takeaways

C++ microbenchmarking with chrono and Google Benchmark: warmup, iterations, variance, and measuring real optimizations—not noise.

What is Benchmarking?

Benchmarking measures how long code takes to run, in the same spirit as timing it with a stopwatch; in C++ this is typically done with std::chrono. By running benchmarks before and after a performance optimization, you can quantify the improvement with numbers instead of intuition.

Benchmarking Process

graph LR
    A[Write Code] --> B[Warmup]
    B --> C[Start Measure]
    C --> D[Repeat Run]
    D --> E[End Measure]
    E --> F[Calc Stats]
    F --> G{Goal Met?}
    G -->|No| H[Optimize]
    H --> A
    G -->|Yes| I[Done]
#include <chrono>
#include <iostream>

auto start = std::chrono::high_resolution_clock::now();

// Execute code

auto end = std::chrono::high_resolution_clock::now();
auto duration = std::chrono::duration_cast<std::chrono::microseconds>(
    end - start
);

std::cout << "Time: " << duration.count() << "μs" << std::endl;

Basic Measurement

#include <chrono>
#include <iostream>
#include <vector>

template<typename Func>
auto benchmark(Func f, int iterations = 1000) {
    using namespace std::chrono;
    
    auto start = high_resolution_clock::now();
    
    for (int i = 0; i < iterations; ++i) {
        f();
    }
    
    auto end = high_resolution_clock::now();
    auto total = duration_cast<microseconds>(end - start);
    
    // Divide as double so sub-microsecond averages are not truncated to 0
    return total.count() / static_cast<double>(iterations);
}

int main() {
    auto avgTime = benchmark([]() {
        std::vector<int> v(1000);  // may be optimized away; see "Common Issues" below
    });
    
    std::cout << "Average: " << avgTime << "μs" << std::endl;
}

Practical Examples

Example 1: Comparing Algorithms

#include <algorithm>
#include <chrono>
#include <cstdlib>
#include <iostream>
#include <vector>

void compareSort() {
    std::vector<int> data(100000);
    std::generate(data.begin(), data.end(), std::rand);
    
    // std::sort
    auto data1 = data;
    auto start1 = std::chrono::high_resolution_clock::now();
    std::sort(data1.begin(), data1.end());
    auto end1 = std::chrono::high_resolution_clock::now();
    auto time1 = std::chrono::duration_cast<std::chrono::milliseconds>(end1 - start1);
    
    // std::stable_sort
    auto data2 = data;
    auto start2 = std::chrono::high_resolution_clock::now();
    std::stable_sort(data2.begin(), data2.end());
    auto end2 = std::chrono::high_resolution_clock::now();
    auto time2 = std::chrono::duration_cast<std::chrono::milliseconds>(end2 - start2);
    
    std::cout << "sort: " << time1.count() << "ms" << std::endl;
    std::cout << "stable_sort: " << time2.count() << "ms" << std::endl;
}

Performance Comparison of Sorting Algorithms (100,000 elements):

| Algorithm | Average Time | Worst-Case Complexity | Stable | Extra Memory |
|---|---|---|---|---|
| std::sort | ~8ms | O(N log N) | No | O(log N) |
| std::stable_sort | ~12ms | O(N log² N) | Yes | O(N) |
| std::partial_sort | ~5ms (Top 10%) | O(N log K) | No | O(1) |
| std::nth_element | ~2ms (Median) | O(N) average | No | O(1) |

Example 2: Collecting Statistics

#include <algorithm>
#include <cmath>
#include <iostream>
#include <numeric>
#include <vector>

class BenchmarkStats {
    std::vector<double> samples;
    
public:
    void addSample(double microseconds) {
        samples.push_back(microseconds);
    }
    
    void printStats() const {
        auto sum = std::accumulate(samples.begin(), samples.end(), 0.0);
        auto avg = sum / samples.size();
        
        auto sorted = samples;
        std::sort(sorted.begin(), sorted.end());
        auto median = sorted[sorted.size() / 2];
        
        auto min = *std::min_element(samples.begin(), samples.end());
        auto max = *std::max_element(samples.begin(), samples.end());
        
        // Standard deviation
        double variance = 0.0;
        for (double s : samples) {
            variance += (s - avg) * (s - avg);
        }
        double stddev = std::sqrt(variance / samples.size());
        
        std::cout << "Average: " << avg << "μs" << std::endl;
        std::cout << "Median: " << median << "μs" << std::endl;
        std::cout << "Min: " << min << "μs" << std::endl;
        std::cout << "Max: " << max << "μs" << std::endl;
        std::cout << "StdDev: " << stddev << "μs" << std::endl;
    }
};

Meaning of Statistical Metrics:

| Metric | Meaning | Usage |
|---|---|---|
| Mean | Overall average time | General performance |
| Median | Middle value | Robust to outliers |
| Min | Best performance | Optimal conditions |
| Max | Worst performance | Worst-case scenario |
| StdDev | Variability | Stability evaluation |
| Percentile (P95, P99) | Value below which 95% / 99% of samples fall | SLA benchmarks |

Example 3: Google Benchmark

#include <benchmark/benchmark.h>
#include <vector>

static void BM_VectorPushBack(benchmark::State& state) {
    for (auto _ : state) {
        std::vector<int> v;
        for (int i = 0; i < state.range(0); ++i) {
            v.push_back(i);
        }
    }
}

BENCHMARK(BM_VectorPushBack)->Range(8, 8<<10);

static void BM_VectorReserve(benchmark::State& state) {
    for (auto _ : state) {
        std::vector<int> v;
        v.reserve(state.range(0));
        for (int i = 0; i < state.range(0); ++i) {
            v.push_back(i);
        }
    }
}

BENCHMARK(BM_VectorReserve)->Range(8, 8<<10);

BENCHMARK_MAIN();

Example 4: Warmup

template<typename Func>
auto benchmarkWithWarmup(Func f, int warmup, int iterations) {
    // Warmup
    for (int i = 0; i < warmup; ++i) {
        f();
    }
    
    // Measurement
    BenchmarkStats stats;
    for (int i = 0; i < iterations; ++i) {
        auto start = std::chrono::high_resolution_clock::now();
        f();
        auto end = std::chrono::high_resolution_clock::now();
        
        auto duration = std::chrono::duration_cast<std::chrono::microseconds>(
            end - start
        );
        stats.addSample(duration.count());
    }
    
    return stats;
}

Benchmarking Tips

Checklist for Accurate Measurement

| Item | Recommendation | Reason |
|---|---|---|
| Warmup | 10-100 runs | Warm CPU caches and branch predictor |
| Repetitions | 100-1000 runs | Ensure statistical significance |
| Prevent optimization | volatile or DoNotOptimize | Avoid compiler optimizations |
| Isolated runs | Close other processes | Minimize noise |
| CPU pinning | taskset (Linux) | Prevent core migration |
| Release build | -O3 -DNDEBUG | Measure real performance |
// 1. Warmup
for (int i = 0; i < 10; ++i) {
    f();  // Warm up cache
}

// 2. Multiple Measurements
for (int i = 0; i < 100; ++i) {
    benchmark(f);
}

// 3. Prevent Optimization
volatile int result = compute();

// 4. Statistical Analysis
// Mean, Median, StdDev

Common Issues

Issue 1: Optimization

// ❌ Removed by optimization
auto time = benchmark([]() {
    int x = 42;
    return x;
});

// ✅ Keep the result observable
int result = 0;
auto time = benchmark([&]() {
    result = compute();
});
// benchmark::DoNotOptimize is from Google Benchmark;
// outside it, a volatile sink works: volatile int sink = result;
benchmark::DoNotOptimize(result);

Issue 2: Cache

// ❌ Cache effects
auto time1 = benchmark(f);  // Cache miss
auto time2 = benchmark(f);  // Cache hit (faster)

// ✅ Warmup
for (int i = 0; i < 10; ++i) f();
auto time = benchmark(f);

Issue 3: Variability

// ❌ A single measurement is unreliable
auto time = benchmark(f);

// ✅ Multiple measurements
BenchmarkStats stats;
for (int i = 0; i < 100; ++i) {
    stats.addSample(benchmark(f));
}
stats.printStats();

Issue 4: Measurement Overhead

// ❌ Very short task: the two clock reads cost more than the work itself
auto time = benchmark([]() {
    int x = 1 + 1;
});

// Measurement overhead > Actual time
// Repeat multiple times and average

Google Benchmark

# Installation
git clone https://github.com/google/benchmark.git
cd benchmark
cmake -E make_directory "build"
cmake -E chdir "build" cmake -DBENCHMARK_DOWNLOAD_DEPENDENCIES=on -DCMAKE_BUILD_TYPE=Release ../
cmake --build "build" --config Release

# Compilation
g++ -std=c++17 bench.cpp -lbenchmark -lpthread -o bench

# Execution
./bench

FAQ

Q1: What is Benchmarking?

A: Measuring how long code takes to run, so optimizations can be verified with numbers rather than intuition.

Q2: What is Warmup?

A: A few unmeasured runs before timing starts, letting caches, the branch predictor, and CPU frequency scaling settle.

Q3: What Statistics Should I Look At?

A: At minimum mean, median, and standard deviation; add P95/P99 percentiles for latency-sensitive code.

Q4: What Tools Are Available?

A: Google Benchmark, perf, VTune.

Q5: How to Prevent Optimization?

A: Mark results volatile or pass them to benchmark::DoNotOptimize.

Q6: Learning Resources?

A:

  • “Optimized C++”
  • “Google Benchmark Docs”
  • cppreference.com

Related Posts: Stopwatch Benchmarking, Duration, Time Conversion, Performance Optimization.