C++ thread_local | Thread-Local Storage (TLS) Complete Guide

Key takeaway

C++11 thread_local: per-thread storage, caches, RNGs, initialization, and patterns without shared mutex overhead.

Introduction

The C++11 thread_local storage specifier gives each thread its own independent copy of a variable. Because no other thread can reach that copy by name, per-thread data such as counters, caches, and RNG state needs no lock. Data that is genuinely shared between threads still requires synchronization.

1. thread_local basics

Concept

#include <thread>
#include <iostream>
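// Every thread gets its own copy of counter, initialized to 0.
thread_local int counter = 0;
void func() {
    counter++;  // modifies only the calling thread's copy
    // Each thread prints 1; output from the two threads may interleave.
    std::cout << "Thread " << std::this_thread::get_id()
              << ": " << counter << std::endl;
}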
int main() {
    std::thread t1(func);
    std::thread t2(func);
    
    t1.join();
    t2.join();
}

Basic usage

#include <thread>
#include <iostream>
thread_local int x = 0;
void worker() {
    x++;
    std::cout << "Thread " << std::this_thread::get_id() 
              << ": " << x << std::endl;
}
int main() {
    std::thread t1(worker);
    std::thread t2(worker);
    
    t1.join();
    t2.join();
}

2. Practical examples

Example 1: Per-thread request counter

#include <cstddef>
#include <thread>
#include <vector>
#include <iostream>
// Counts requests handled by the calling thread only.
thread_local std::size_t requestCount = 0;
void handleRequest() {
    requestCount++;
    std::cout << "Thread " << std::this_thread::get_id()
              << " requests: " << requestCount << std::endl;
}
int main() {
    std::vector<std::thread> threads;
    
    for (int i = 0; i < 5; i++) {
        threads.emplace_back([] {
            for (int j = 0; j < 3; j++) {
                handleRequest();
            }
        });
    }
    
    for (auto& t : threads) {
        t.join();
    }
}

Example 2: Per-thread buffer

#include <thread>
#include <vector>
#include <iostream>
thread_local std::vector<int> buffer;
void flush(const std::vector<int>& buf) {
    std::cout << "Flush: " << buf.size() << " items" << std::endl;
}
void process(int value) {
    buffer.push_back(value);
    
    if (buffer.size() >= 100) {
        flush(buffer);
        buffer.clear();
    }
}
int main() {
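    std::thread t1([] {
        // One flush happens at 100 items; the last 50 stay in this
        // thread's buffer and are destroyed with it when the thread exits.
        for (int i = 0; i < 150; i++) {
            process(i);
        }
    });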
    
    t1.join();
}

Example 3: Random number generator

#include <random>
#include <thread>
#include <iostream>
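// One engine per thread, seeded once per thread.
thread_local std::mt19937 rng(std::random_device{}());
int getRandomNumber() {
    std::uniform_int_distribution<int> dist(1, 100);
    return dist(rng);  // no lock needed: rng belongs to the calling thread
}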
int main() {
    std::thread t1([] {
        for (int i = 0; i < 5; i++) {
            std::cout << "Thread 1: " << getRandomNumber() << std::endl;
        }
    });
    
    std::thread t2([] {
        for (int i = 0; i < 5; i++) {
            std::cout << "Thread 2: " << getRandomNumber() << std::endl;
        }
    });
    
    t1.join();
    t2.join();
}

3. Initialization

At thread start

#include <thread>
#include <iostream>
// Constant-initialized: every thread sees x == 10 from the start.
thread_local int x = 10;
void worker() {
    std::cout << "x = " << x << std::endl;
}
int main() {
    std::thread t1(worker);
    std::thread t2(worker);
    
    t1.join();
    t2.join();
}

First use

#include <thread>
#include <iostream>
int compute() {
    std::cout << "compute() called" << std::endl;
    return 42;
}
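void func() {
    // Initialized the first time this thread reaches the declaration,
    // so compute() runs once per thread, not once per call.
    thread_local int y = compute();
    std::cout << "y = " << y << std::endl;
}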
int main() {
    std::thread t1([] {
        func();
        func();
    });
    
    t1.join();
}

4. Common problems

Problem 1: Destruction order

#include <thread>
#include <iostream>
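struct Resource {
    void use() {}  // gives func() a way to touch the object
    ~Resource() {
        std::cout << "Resource destroyed" << std::endl;
    }
};
thread_local Resource r;
void func() {
    // Touch r: implementations may defer initialization until first use,
    // and a thread that never uses r may never construct or destroy it.
    r.use();
    std::cout << "func() running" << std::endl;
}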
int main() {
    std::thread t1(func);
    t1.join();
}
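
The order in which a thread's thread_local objects are destroyed can be made visible with a small log. The sketch below is illustrative only: the `Logged` type, the `record` helper, and the shared `events` vector are hypothetical names that do not appear in the original article. It assumes that objects with thread storage duration are destroyed at thread exit in reverse order of construction.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical event log shared by all threads, guarded by a mutex.
std::mutex logMutex;
std::vector<std::string> events;

void record(const std::string& event) {
    std::lock_guard<std::mutex> lock(logMutex);
    events.push_back(event);
}

// Logs its own construction and destruction under a given name.
struct Logged {
    explicit Logged(std::string n) : name(std::move(n)) { record("ctor " + name); }
    ~Logged() { record("dtor " + name); }
    std::string name;
};

void worker() {
    thread_local Logged first("first");    // constructed when first reached
    thread_local Logged second("second");  // constructed after `first`
    record("worker done");
}

// Runs worker() on a new thread and returns a copy of the recorded events.
std::vector<std::string> runAndCollect() {
    std::thread t(worker);
    t.join();  // the thread's thread_local destructors are expected to have run by now
    assert(!t.joinable());
    std::lock_guard<std::mutex> lock(logMutex);
    return events;
}
```

Calling `runAndCollect()` once should yield the two constructions, then "worker done", then the destruction of `second` followed by `first`. A thread_local destructor that touches another thread_local object therefore has to take this ordering into account.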

Problem 2: Class static members

#include <iostream>
class MyClass {
public:
    static thread_local int x;
};
thread_local int MyClass::x = 0;
int main() {
    MyClass::x = 42;
    std::cout << MyClass::x << std::endl;  // 42
}

Problem 3: Initialization cost

#include <memory>
#include <iostream>
struct ExpensiveObject {
    ExpensiveObject() {
        std::cout << "ExpensiveObject constructed" << std::endl;
    }
};
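// Only a null pointer exists per thread; the object itself is built on demand.
thread_local std::unique_ptr<ExpensiveObject> obj;
void func() {
    if (!obj) {
        // Runs once per thread that calls func(); std::make_unique requires C++14.
        obj = std::make_unique<ExpensiveObject>();
    }
}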
int main() {
    func();
    func();
}

Problem 4: Memory usage

#include <vector>
#include <thread>
#include <iostream>
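// Each thread that uses largeBuffer allocates its own 1,000,000 ints
// (about 4 MB with a 4-byte int), so memory grows with the thread count.
thread_local std::vector<int> largeBuffer(1000000);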
void worker() {
    std::cout << "Buffer size: " << largeBuffer.size() << std::endl;
}
int main() {
    std::thread t1(worker);
    std::thread t2(worker);
    
    t1.join();
    t2.join();
}

5. Usage patterns

Pattern 1: Per-thread cache

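#include <unordered_map>
#include <string>
// Placeholder for the expensive computation; the original leaves it undefined.
int computeValue(const std::string& key);
thread_local std::unordered_map<std::string, int> cache;
int getValue(const std::string& key) {
    auto it = cache.find(key);  // single lookup on a hit
    if (it != cache.end()) {
        return it->second;
    }
    
    int value = computeValue(key);
    cache[key] = value;
    return value;
}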

Pattern 2: Per-thread statistics

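#include <cstddef>
#include <iostream>
struct Statistics {
    std::size_t count = 0;
    std::size_t errors = 0;
    
    // Prints the calling thread's totals; const because it only reads.
    void print() const {
        std::cout << "Count: " << count << ", Errors: " << errors << std::endl;
    }
};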
thread_local Statistics stats;
void processRequest() {
    stats.count++;
}

Summary

Key points

  1. thread_local: independent variable per thread
  2. Initialization: at thread start or first use
  3. Uses: caches, stats, RNGs
  4. Performance: fast access; initialization cost exists
  5. Memory: scales with thread count × variable size
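
Points 1 and 2 can be checked with a small experiment. The sketch below is illustrative: `Tracked`, `local`, `runThreads`, and the `constructions` counter are hypothetical names introduced here, not part of the original article. It assumes a function-local thread_local, which is constructed the first time each thread reaches its declaration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical counter: how many Tracked objects have been constructed in total.
std::atomic<int> constructions{0};

struct Tracked {
    Tracked() { constructions.fetch_add(1); }
    int value = 0;
};

// Returns the calling thread's own instance, constructing it on the first call.
Tracked& local() {
    thread_local Tracked instance;
    return instance;
}

// Each thread increments its own instance `calls` times and reports the value it saw.
std::vector<int> runThreads(int threadCount, int calls) {
    assert(threadCount >= 0 && calls >= 0);
    std::vector<int> seen(threadCount, 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < threadCount; ++i) {
        threads.emplace_back([&seen, i, calls] {
            for (int c = 0; c < calls; ++c) {
                local().value++;  // touches only this thread's instance
            }
            seen[i] = local().value;  // each thread writes only its own slot
        });
    }
    for (auto& t : threads) {
        t.join();
    }
    return seen;
}
```

After `runThreads(4, 3)`, every entry of the result should be 3 rather than an accumulated total, and `constructions` should equal 4: one instance per thread, built on that thread's first use.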

thread_local vs global

Aspect          | thread_local                | Global
Thread safety   | Yes (one copy per thread)   | No (needs synchronization)
Synchronization | Not needed within a thread  | Often required
Memory          | One copy per thread         | Single instance
Performance     | Fast reads                  | Can be slow with locks
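
The synchronization row of the comparison can be made concrete with a counter written both ways. This is a minimal sketch under stated assumptions: all names (`sharedCount`, `localCount`, `mergedCount`, `runOnThreads`, and the two worker functions) are hypothetical and chosen for illustration. The thread_local variant still takes a lock, but only once per thread to publish its total.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

std::mutex sharedMutex;

// Shared global: every increment must take the lock.
std::size_t sharedCount = 0;

// Per-thread counter: increments need no synchronization.
thread_local std::size_t localCount = 0;

// Global total that each thread adds its local count to once, under the lock.
std::size_t mergedCount = 0;

void countShared(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        std::lock_guard<std::mutex> lock(sharedMutex);
        ++sharedCount;
    }
}

void countLocalThenMerge(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        ++localCount;  // touches only this thread's copy
    }
    std::lock_guard<std::mutex> lock(sharedMutex);
    mergedCount += localCount;  // one synchronized write per thread
    localCount = 0;
}

// Runs `worker` on `threadCount` threads and waits for all of them.
void runOnThreads(int threadCount, int iterations, void (*worker)(int)) {
    assert(threadCount > 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < threadCount; ++i) {
        threads.emplace_back(worker, iterations);
    }
    for (auto& t : threads) {
        t.join();
    }
}
```

With 4 threads and 1000 iterations each, both `sharedCount` and `mergedCount` end at 4000. The first variant acquires the mutex 4000 times and the second only 4 times, which is where the difference under contention would come from.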

Practical tips

  • Use for per-thread caches
  • Prefer thread_local for RNGs
  • Mind initialization cost
  • Watch total memory with many threads
