C++ Binary Serialization | Fixing "My Game Save File Is Corrupted": Endianness and Padding Problems

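February 24, 2026 · 25 min read · Updated March 12, 2026 · Intermediate · Hands-on

Key takeaway

A practical guide to binary serialization in C++.

Introduction: the game save file got corrupted

“Unable to load the saved game”

I built a feature that saves game progress. When the save file was opened on a different computer, however, the data came back corrupted.

The serialization flow can be visualized as follows.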

flowchart LR
  subgraph bad["❌ Wrong approach"]
    B1[Dump struct memory] --> B2[Includes padding and endianness]
    B2 --> B3[Breaks on other platforms]
  end
  subgraph good["✅ Right approach"]
    G1[Serialize field by field] --> G2[Version, length, data]
    G2 --> G3[Portable format]
  end

The problematic code:

struct SaveData {
    int level;
    float health;
    char name[32];
};

void saveGame(const SaveData& data) {
    std::ofstream file("save.dat", std::ios::binary);
    file.write(reinterpret_cast<const char*>(&data), sizeof(data));
}

SaveData loadGame() {
    SaveData data;
    std::ifstream file("save.dat", std::ios::binary);
    file.read(reinterpret_cast<char*>(&data), sizeof(data));
    return data;
}

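About the code above: dumping the whole struct through reinterpret_cast stores the padding (empty bytes inserted for alignment) along with the data, and because endianness and padding can differ between platforms, the file breaks when it is read on another machine. For a type that holds a pointer, such as std::string, only the address would be stored, which is meaningless after loading. Serialization needs an agreed convention: field by field, versioned, and with fixed-size types.

Problems:

The struct is stored as-is (padding included)
Endianness differences are ignored
There is no versioning
Pointers and dynamic memory are not handled

Serialization is the process of converting data into a form that can be stored or transmitted. Binary serialization is not a “memory dump”; it requires a convention for writing and reading field by field. Padding (alignment bytes the compiler inserts between struct members) and endianness (the byte storage order, big-endian or little-endian, which can differ by platform) vary across CPUs and operating systems. Fixing the format improves portability: include a version number, store integers at a fixed size (uint32_t and so on), and store strings as a length followed by the bytes.
In practice: if you plan to exchange files with other languages or platforms, a format such as JSON or Protocol Buffers is the safer choice. If the files stay within C++, documenting a “version + field by field” convention like the one above makes debugging easier.

After the fix: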

struct SaveData {
    int32_t level;   // fixed-width, so the on-disk size is the same on every platform
    float health;
    std::string name;
    
    void serialize(std::ostream& out) const {
        // Version information
        uint32_t version = 1;
        out.write(reinterpret_cast<const char*>(&version), sizeof(version));
        
        // Data
        out.write(reinterpret_cast<const char*>(&level), sizeof(level));
        out.write(reinterpret_cast<const char*>(&health), sizeof(health));
        
        // String length + data
        uint32_t nameLen = static_cast<uint32_t>(name.size());
        out.write(reinterpret_cast<const char*>(&nameLen), sizeof(nameLen));
        out.write(name.data(), nameLen);
    }
    
    void deserialize(std::istream& in) {
        // Read in exactly the order serialize wrote
        // (a real loader would validate the version before continuing)
        uint32_t version;
        in.read(reinterpret_cast<char*>(&version), sizeof(version));
        
        in.read(reinterpret_cast<char*>(&level), sizeof(level));
        in.read(reinterpret_cast<char*>(&health), sizeof(health));
        
        uint32_t nameLen;
        in.read(reinterpret_cast<char*>(&nameLen), sizeof(nameLen));
        name.resize(nameLen);
        in.read(&name[0], nameLen);
    }
};

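About the code above: the version number is written first, then level and health at a fixed size, then name as a length (uint32_t) followed by its bytes. deserialize reads in the same order, and fills name with read after a resize. Variable-length data such as strings must be stored as “length + data” so that it can be loaded back.

After reading this article you will be able to:

Read and write binary files.
Serialize structs safely.
Understand and handle endianness problems.
Plan for versioning and compatibility.

1. Binary file basics

Text vs. binary

In text mode, output with << converts the number 12345 into the string “12345”, which is stored as 5 bytes. In binary mode, write copies the in-memory representation as-is, so an int usually takes just 4 bytes. Binary is smaller and faster to read and write, but endianness, padding, and type sizes can vary by platform, so a “format convention” has to be fixed up front.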

// Text mode
std::ofstream text("data.txt");
text << 12345;  // "12345" (5 bytes)

// Binary mode
std::ofstream bin("data.bin", std::ios::binary);
int value = 12345;
bin.write(reinterpret_cast<const char*>(&value), sizeof(value));  // sizeof(int) bytes, usually 4

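About the code above: in text mode, 12345 becomes the string “12345” and takes 5 bytes. In binary mode, the 4 bytes of the int in memory are written as-is, which is smaller and faster, but endianness and type sizes can differ by platform, so a format convention is needed.

Differences:

Text: human-readable, larger
Binary: fast, smaller, not human-readable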
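
To make the size difference concrete, here is a minimal sketch that measures both forms in memory with std::ostringstream instead of real files. The helper names textSize and binarySize are made up for this illustration and are not part of the article's code.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Number of bytes produced by formatted (text) output of `value`.
std::size_t textSize(int value) {
    std::ostringstream out;
    out << value;                      // e.g. 12345 becomes the characters "12345"
    return out.str().size();
}

// Number of bytes produced by raw (binary) output of `value`.
std::size_t binarySize(int value) {
    std::ostringstream out(std::ios::binary);
    out.write(reinterpret_cast<const char*>(&value), sizeof(value));
    assert(out.good());                // in-memory writes are not expected to fail
    return out.str().size();
}
```

The text form grows with the number of digits (textSize(12345) is 5), while the binary form always stays at sizeof(int), whatever the value is.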

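Reading and writing basic types

reinterpret_cast<const char*>(&i) treats the variable's memory as a “byte sequence”, and exactly sizeof(i) bytes are written. To read, call read in the same order into a variable of the same type. The write order and read order must match. If std::ios::binary is not specified when opening the file, newline characters are converted on Windows and similar platforms, which can corrupt binary data.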

void writeBinary() {
    std::ofstream file("data.bin", std::ios::binary);
    
    int i = 42;
    double d = 3.14;
    bool b = true;
    
    file.write(reinterpret_cast<const char*>(&i), sizeof(i));
    file.write(reinterpret_cast<const char*>(&d), sizeof(d));
    file.write(reinterpret_cast<const char*>(&b), sizeof(b));
}

void readBinary() {
    std::ifstream file("data.bin", std::ios::binary);
    
    int i;
    double d;
    bool b;
    
    file.read(reinterpret_cast<char*>(&i), sizeof(i));
    file.read(reinterpret_cast<char*>(&d), sizeof(d));
    file.read(reinterpret_cast<char*>(&b), sizeof(b));
    
    std::cout << "i=" << i << ", d=" << d << ", b=" << b << "\n";
}

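About the code above: reinterpret_cast lets us view a variable's address as a char* byte sequence, and write/read transfers sizeof bytes. The order and types on the reading side must exactly match the order and types that were written, and the file must be opened with ios::binary so that no newline conversion happens on Windows.

Reading and writing arrays

Storing the element count first lets the reader know the size. If only the data is written, the reader has no way of knowing how many elements there are, so the usual approach is to write the count as a 4-byte or 8-byte integer and then the array data. To read, fetch count first, allocate space with vector<int>(count), and fill it with a single read.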

void writeArray() {
    std::ofstream file("array.bin", std::ios::binary);
    
    int arr[] = {1, 2, 3, 4, 5};
    // Fixed-width count: size_t is 4 or 8 bytes depending on the platform
    uint32_t count = static_cast<uint32_t>(sizeof(arr) / sizeof(arr[0]));
    
    // Store the count first
    file.write(reinterpret_cast<const char*>(&count), sizeof(count));
    // Then store the data
    file.write(reinterpret_cast<const char*>(arr), sizeof(arr));
}

void readArray() {
    std::ifstream file("array.bin", std::ios::binary);
    
    // Read the count with the same fixed-width type that was written
    uint32_t count = 0;
    file.read(reinterpret_cast<char*>(&count), sizeof(count));
    
    std::vector<int> arr(count);
    file.read(reinterpret_cast<char*>(arr.data()), count * sizeof(int));
    
    for (int val : arr) {
        std::cout << val << " ";
    }
}

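About the code above: storing the count first tells the reader how large the array is. The reader fetches count, sizes the vector accordingly, and fills data() with a single read. If only the data is written, the reader cannot tell where it ends.

2. Struct serialization

POD structs (Plain Old Data)

A POD (Plain Old Data) struct has no constructors, virtual functions, or reference members, and all of its members are POD as well, so its memory layout is predictable. In that case you can pass its address through reinterpret_cast and write sizeof(Point) bytes. If the files will be exchanged between different compilers, options, or platforms, however, padding and endianness may differ, so it is safer to write field by field or to align settings such as #pragma pack on both sides.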

// After copying and pasting: g++ -std=c++17 -o point_bin point_bin.cpp && ./point_bin
#include <fstream>
#include <iostream>

struct Point {
    int x;
    int y;
};

void savePoint(const Point& p) {
    std::ofstream file("point.bin", std::ios::binary);
    file.write(reinterpret_cast<const char*>(&p), sizeof(p));
}

Point loadPoint() {
    Point p;
    std::ifstream file("point.bin", std::ios::binary);
    file.read(reinterpret_cast<char*>(&p), sizeof(p));
    return p;
}

int main() {
    savePoint({10, 20});
    Point p = loadPoint();
    std::cout << p.x << " " << p.y << "\n";  // 10 20
    return 0;
}

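About the code above: Point is POD (no constructors, virtual functions, or references, and all members are POD), so its memory layout is predictable and it can be written and read by passing its address and sizeof. This is fine for files used only with the same compiler and platform, but if portability matters, writing field by field is safer.

Output: the single line 10 20 is printed.

Caution: only POD types can be stored this way!

The padding problem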

struct Data {
    char c;     // 1 byte
    // 3 bytes of padding
    int i;      // 4 bytes
    char c2;    // 1 byte
    // 3 bytes of padding
};  // 12 bytes in total on typical platforms (padding included)

// ❌ Bad: the padding is stored too
file.write(reinterpret_cast<const char*>(&data), sizeof(data));

// ✅ Good: store field by field
file.write(&data.c, sizeof(data.c));
file.write(reinterpret_cast<const char*>(&data.i), sizeof(data.i));
file.write(&data.c2, sizeof(data.c2));

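About the code above: writing the whole struct also stores the padding the compiler inserted (empty bytes for alignment), and the layout can change under a different compiler or different options. Writing field by field stores only the actual data, which is more portable; the reader simply reads the fields back in the same order.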
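
The gap between the struct size and the payload size can be checked directly. The following is a small sketch under the assumption that Data uses a fixed-width int32_t instead of int; kPayloadBytes, paddingBytes, and packFields are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>

struct Data {
    char c;
    std::int32_t i;
    char c2;
};

// Payload size when the three fields are written one by one: 1 + 4 + 1.
constexpr std::size_t kPayloadBytes =
    sizeof(char) + sizeof(std::int32_t) + sizeof(char);

// Bytes the compiler added for alignment (commonly 6 here, but not guaranteed).
constexpr std::size_t paddingBytes() { return sizeof(Data) - kPayloadBytes; }

// Writes the fields individually, so no padding reaches the output.
std::string packFields(const Data& d) {
    std::ostringstream out(std::ios::binary);
    out.write(&d.c, sizeof(d.c));
    out.write(reinterpret_cast<const char*>(&d.i), sizeof(d.i));
    out.write(&d.c2, sizeof(d.c2));
    std::string bytes = out.str();
    assert(bytes.size() == kPayloadBytes);
    return bytes;
}
```

packFields always yields 6 bytes, whereas sizeof(Data) depends on the compiler's alignment rules; that difference is exactly what a whole-struct dump would leak into the file.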

The serialization method pattern

struct Player {
    int id;
    float x, y;
    int health;
    
    void save(std::ostream& out) const {
        out.write(reinterpret_cast<const char*>(&id), sizeof(id));
        out.write(reinterpret_cast<const char*>(&x), sizeof(x));
        out.write(reinterpret_cast<const char*>(&y), sizeof(y));
        out.write(reinterpret_cast<const char*>(&health), sizeof(health));
    }
    
    void load(std::istream& in) {
        in.read(reinterpret_cast<char*>(&id), sizeof(id));
        in.read(reinterpret_cast<char*>(&x), sizeof(x));
        in.read(reinterpret_cast<char*>(&y), sizeof(y));
        in.read(reinterpret_cast<char*>(&health), sizeof(health));
    }
};

int main() {
    Player p1{1, 10.5f, 20.3f, 100};
    
    // Save (the inner scope closes the file, so the data is flushed before it is read back)
    {
        std::ofstream out("player.bin", std::ios::binary);
        p1.save(out);
    }
    
    // Load
    Player p2;
    std::ifstream in("player.bin", std::ios::binary);
    p2.load(in);
}

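About the code above: save and load are member functions that write and read each field in a fixed order. The write order and read order must match, and because the functions take a stream as a parameter, the same code can serialize to files and to in-memory streams alike.

3. Handling dynamic data

Serializing std::string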

void writeString(std::ostream& out, const std::string& str) {
    // Store the length
    uint32_t len = static_cast<uint32_t>(str.size());
    out.write(reinterpret_cast<const char*>(&len), sizeof(len));
    
    // Store the string data
    out.write(str.data(), len);
}

std::string readString(std::istream& in) {
    // Read the length
    uint32_t len;
    in.read(reinterpret_cast<char*>(&len), sizeof(len));
    
    // Read the string data
    std::string str(len, '\0');
    in.read(&str[0], len);
    
    return str;
}

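About the code above: a string has variable length, so the length (uint32_t) is written first, followed by len bytes from data(). To read, fetch the length, create a string of that size, and fill it by reading into &str[0]. As long as both sides agree on the byte order of the length field, the “length + bytes” convention can be restored on any platform.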
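
One thing readString does not guard against is a corrupted length field, which would make it allocate whatever size the file claims. Below is a hedged sketch of a more defensive variant. The names writeStringChecked and readStringChecked and the kMaxStringLen limit are assumptions made for this example, not part of the article's format.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical upper bound for this sketch; pick one that fits your format.
constexpr std::uint32_t kMaxStringLen = 1u << 20;  // 1 MiB

void writeStringChecked(std::ostream& out, const std::string& str) {
    assert(str.size() <= kMaxStringLen);  // caller must respect the format limit
    const std::uint32_t len = static_cast<std::uint32_t>(str.size());
    out.write(reinterpret_cast<const char*>(&len), sizeof(len));
    out.write(str.data(), len);
}

// Returns false (leaving `result` untouched) on truncated or implausible input.
bool readStringChecked(std::istream& in, std::string& result) {
    std::uint32_t len = 0;
    if (!in.read(reinterpret_cast<char*>(&len), sizeof(len))) return false;
    if (len > kMaxStringLen) return false;          // reject before allocating
    std::string tmp(len, '\0');
    if (len > 0 && !in.read(&tmp[0], len)) return false;
    result = std::move(tmp);
    return true;
}
```

Reading into a temporary and assigning only on success means a failed load never leaves a half-filled string behind, which makes it easy to test with a std::stringstream.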

Serializing std::vector

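template <typename T>
void writeVector(std::ostream& out, const std::vector<T>& vec) {
    // A raw copy is only valid for trivially copyable element types (needs <type_traits>)
    static_assert(std::is_trivially_copyable<T>::value, "T must be trivially copyable");
    
    // Store the element count
    uint32_t count = static_cast<uint32_t>(vec.size());
    out.write(reinterpret_cast<const char*>(&count), sizeof(count));
    
    // Store the data (POD types only)
    out.write(reinterpret_cast<const char*>(vec.data()), 
              count * sizeof(T));
}

template <typename T>
std::vector<T> readVector(std::istream& in) {
    static_assert(std::is_trivially_copyable<T>::value, "T must be trivially copyable");
    
    // Read the element count
    uint32_t count;
    in.read(reinterpret_cast<char*>(&count), sizeof(count));
    
    // Read the data
    std::vector<T> vec(count);
    in.read(reinterpret_cast<char*>(vec.data()), 
            count * sizeof(T));
    
    return vec;
}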

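About the code above: a vector is handled the same way. The count is written first as a uint32_t, followed by count*sizeof(T) bytes from vec.data(). Only POD types can be written in one shot like this. To read, fetch count, size the vector accordingly, and fill it with read.

A more complex struct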

struct GameState {
    int score;
    std::string playerName;
    std::vector<int> inventory;
    
    void serialize(std::ostream& out) const {
        // score
        out.write(reinterpret_cast<const char*>(&score), sizeof(score));
        
        // playerName
        writeString(out, playerName);
        
        // inventory
        writeVector(out, inventory);
    }
    
    void deserialize(std::istream& in) {
        // score
        in.read(reinterpret_cast<char*>(&score), sizeof(score));
        
        // playerName
        playerName = readString(in);
        
        // inventory
        inventory = readVector<int>(in);
    }
};

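About the code above: score is written and read directly at a fixed size, playerName goes through writeString/readString, and inventory goes through writeVector/readVector. For a complex struct, call the appropriate serialization function for each member, writing and reading in the same order.

Complete serialization example: a game save system

Below is a working game save/load example. It includes a version, a magic number, and error checks.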

#include <fstream>
#include <iostream>
#include <string>
#include <vector>
#include <cstdint>

// Shared helpers
void writeString(std::ostream& out, const std::string& str) {
    uint32_t len = static_cast<uint32_t>(str.size());
    out.write(reinterpret_cast<const char*>(&len), sizeof(len));
    out.write(str.data(), len);
}

std::string readString(std::istream& in) {
    uint32_t len;
    in.read(reinterpret_cast<char*>(&len), sizeof(len));
    std::string str(len, '\0');
    in.read(&str[0], len);
    return str;
}

struct GameSave {
    static constexpr uint32_t MAGIC = 0x53415645;  // "SAVE"
    static constexpr uint32_t VERSION = 1;
    
    uint32_t level;
    float health;
    float positionX, positionY;
    std::string playerName;
    std::vector<uint32_t> inventory;
    
    bool save(const std::string& path) const {
        std::ofstream file(path, std::ios::binary);
        if (!file) return false;
        
        file.write(reinterpret_cast<const char*>(&MAGIC), sizeof(MAGIC));
        file.write(reinterpret_cast<const char*>(&VERSION), sizeof(VERSION));
        file.write(reinterpret_cast<const char*>(&level), sizeof(level));
        file.write(reinterpret_cast<const char*>(&health), sizeof(health));
        file.write(reinterpret_cast<const char*>(&positionX), sizeof(positionX));
        file.write(reinterpret_cast<const char*>(&positionY), sizeof(positionY));
        writeString(file, playerName);
        
        uint32_t invCount = static_cast<uint32_t>(inventory.size());
        file.write(reinterpret_cast<const char*>(&invCount), sizeof(invCount));
        file.write(reinterpret_cast<const char*>(inventory.data()), 
                   invCount * sizeof(uint32_t));
        
        return file.good();
    }
    
    bool load(const std::string& path) {
        std::ifstream file(path, std::ios::binary);
        if (!file) return false;
        
        uint32_t magic, version;
        file.read(reinterpret_cast<char*>(&magic), sizeof(magic));
        file.read(reinterpret_cast<char*>(&version), sizeof(version));
        
        if (magic != MAGIC) {
            std::cerr << "Invalid save file format\n";
            return false;
        }
        if (version != VERSION) {
            std::cerr << "Unsupported version: " << version << "\n";
            return false;
        }
        
        file.read(reinterpret_cast<char*>(&level), sizeof(level));
        file.read(reinterpret_cast<char*>(&health), sizeof(health));
        file.read(reinterpret_cast<char*>(&positionX), sizeof(positionX));
        file.read(reinterpret_cast<char*>(&positionY), sizeof(positionY));
        playerName = readString(file);
        
        uint32_t invCount;
        file.read(reinterpret_cast<char*>(&invCount), sizeof(invCount));
        inventory.resize(invCount);
        file.read(reinterpret_cast<char*>(inventory.data()), 
                  invCount * sizeof(uint32_t));
        
        return file.good();
    }
};

int main() {
    GameSave save;
    save.level = 5;
    save.health = 85.0f;
    save.positionX = 100.5f;
    save.positionY = 200.3f;
    save.playerName = "Hero";
    save.inventory = {1, 2, 3, 4, 5};
    
    if (save.save("game_save.dat")) {
        std::cout << "Saved successfully\n";
    }
    
    GameSave loaded;
    if (loaded.load("game_save.dat")) {
        std::cout << "Loaded: Lv." << loaded.level << " " << loaded.playerName 
                  << " HP:" << loaded.health << "\n";
    }
    return 0;
}

Output:

Saved successfully
Loaded: Lv.5 Hero HP:85

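4. Endianness and portability

What is endianness?

Endianness is the order in which the bytes of multi-byte data are laid out in memory or in a file. It differs by platform, so when the same file is read on a different CPU, numbers can come out scrambled.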

flowchart TB
  subgraph le["Little Endian (x86, most ARM)"]
    LE1["0x12345678"] --> LE2["78 56 34 12"]
    LE2 --> LE3["Low byte first"]
  end
  subgraph be["Big Endian (network order, some CPUs)"]
    BE1["0x12345678"] --> BE2["12 34 56 78"]
    BE2 --> BE3["High byte first"]
  end

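// Little Endian (x86, x64): the low byte comes first
// 0x12345678 → 78 56 34 12

// Big Endian (network order): the high byte comes first
// 0x12345678 → 12 34 56 78

About the code above: on little-endian machines (x86 and others) the low byte is stored first, and on big-endian ones (network byte order and others) the high byte is stored first. Reading the same file on a machine with the other endianness yields different numbers, so when portability matters the byte order must be normalized on save and load.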
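
The byte order of the machine you are on can be inspected by copying an integer into a byte array. The following is a minimal sketch; bytesInMemory and hostIsLittleEndian are hypothetical helper names, and mixed-endian hosts are deliberately ignored.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Copies the in-memory representation of `value` into a byte array.
std::array<unsigned char, 4> bytesInMemory(std::uint32_t value) {
    std::array<unsigned char, 4> bytes{};
    std::memcpy(bytes.data(), &value, sizeof(value));
    return bytes;
}

// True when the least significant byte is stored first.
bool hostIsLittleEndian() {
    const std::array<unsigned char, 4> b = bytesInMemory(0x12345678u);
    assert(b[0] == 0x78 || b[0] == 0x12);  // mixed-endian hosts are out of scope
    return b[0] == 0x78;
}
```

On an x86 machine bytesInMemory(0x12345678) gives 78 56 34 12, matching the little-endian row above. std::memcpy is used rather than a pointer cast so the example stays within defined behavior.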

Endianness conversion

#include <cstdint>

uint16_t swap16(uint16_t val) {
    // The operands are promoted to int, so cast the result back to 16 bits
    return static_cast<uint16_t>((val << 8) | (val >> 8));
}

uint32_t swap32(uint32_t val) {
    return ((val & 0xFF000000) >> 24) |
           ((val & 0x00FF0000) >> 8)  |
           ((val & 0x0000FF00) << 8)  |
           ((val & 0x000000FF) << 24);
}

// Or use the standard function (C++23)
#include <bit>
uint32_t val = std::byteswap(original);

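About the code above: swap16 and swap32 reverse the byte order and are used to normalize endianness. In C++23, std::byteswap does the same job. On a big-endian machine, convert to little-endian when saving, and convert back when reading.

Portable storage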

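class BinaryWriter {
    std::ostream& out;
    
public:
    BinaryWriter(std::ostream& o) : out(o) {}
    
    void writeInt32(int32_t val) {
        // Always store as little-endian.
        // The defined() guard matters: where these macros are not predefined,
        // the bare comparison would evaluate as 0 == 0 and swap by mistake.
        #if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
        val = static_cast<int32_t>(swap32(static_cast<uint32_t>(val)));
        #endif
        out.write(reinterpret_cast<const char*>(&val), sizeof(val));
    }
    
    void writeFloat(float val) {
        // Handle float the same way: copy its bits into a uint32_t, then swap if needed
        uint32_t bits;
        std::memcpy(&bits, &val, sizeof(bits));  // needs <cstring>
        #if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
        bits = swap32(bits);
        #endif
        out.write(reinterpret_cast<const char*>(&bits), sizeof(bits));
    }
};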

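About the code above: BinaryWriter normalizes everything to “always little-endian” on save. swap32 is applied only when the platform is __ORDER_BIG_ENDIAN__, so the file has the same byte order regardless of where it was produced. A float is also just a block of bytes, so it can be handled the same way by treating its bits as a uint32_t.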
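
An alternative that avoids preprocessor checks altogether is to build the bytes with shifts, which produces the same output on any host. This is a sketch of that technique rather than the article's BinaryWriter. The function names are invented here, and it assumes a 32-bit float whose byte order matches the integer byte order, which holds on mainstream platforms.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Writes `v` least significant byte first, whatever the host byte order is.
void writeU32LE(std::ostream& out, std::uint32_t v) {
    const char bytes[4] = {
        static_cast<char>(v & 0xFF),
        static_cast<char>((v >> 8) & 0xFF),
        static_cast<char>((v >> 16) & 0xFF),
        static_cast<char>((v >> 24) & 0xFF)};
    out.write(bytes, sizeof(bytes));
}

// Reassembles a value written by writeU32LE; returns false on a short read.
bool readU32LE(std::istream& in, std::uint32_t& v) {
    unsigned char b[4];
    if (!in.read(reinterpret_cast<char*>(b), sizeof(b))) return false;
    v = static_cast<std::uint32_t>(b[0]) |
        (static_cast<std::uint32_t>(b[1]) << 8) |
        (static_cast<std::uint32_t>(b[2]) << 16) |
        (static_cast<std::uint32_t>(b[3]) << 24);
    return true;
}

// Floats travel as their 32-bit pattern.
void writeF32LE(std::ostream& out, float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes 32-bit float");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &f, sizeof(bits));
    writeU32LE(out, bits);
}

bool readF32LE(std::istream& in, float& f) {
    std::uint32_t bits = 0;
    if (!readU32LE(in, bits)) return false;
    std::memcpy(&f, &bits, sizeof(f));
    return true;
}

// Convenience for inspection: the 4 bytes that writeU32LE produces.
std::string encodeU32LE(std::uint32_t v) {
    std::ostringstream out(std::ios::binary);
    writeU32LE(out, v);
    assert(out.str().size() == 4);
    return out.str();
}
```

encodeU32LE(0x12345678) yields the bytes 78 56 34 12 on every platform, because the shifts operate on the value rather than on its memory layout.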

Endianness conversion utilities (complete example)

#include <cstdint>
#include <cstring>
#include <fstream>

inline uint16_t toLittleEndian16(uint16_t val) {
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    return (val >> 8) | (val << 8);
#else
    return val;
#endif
}

inline uint32_t toLittleEndian32(uint32_t val) {
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    return ((val & 0xFF) << 24) | ((val & 0xFF00) << 8) |
           ((val & 0xFF0000) >> 8) | ((val & 0xFF000000) >> 24);
#else
    return val;
#endif
}

// float: reinterpret the bits as uint32_t, then convert
inline float toLittleEndianFloat(float val) {
    uint32_t u;
    std::memcpy(&u, &val, sizeof(val));
    u = toLittleEndian32(u);
    float result;
    std::memcpy(&result, &u, sizeof(result));
    return result;
}

// Usage: always store as little-endian
void writePortable(std::ostream& out, uint32_t value) {
    uint32_t le = toLittleEndian32(value);
    out.write(reinterpret_cast<const char*>(&le), sizeof(le));
}

5. Practical serialization patterns

Versioning

struct SaveFile {
    static constexpr uint32_t MAGIC = 0x53415645;  // "SAVE"
    static constexpr uint32_t VERSION = 2;
    
    int score;
    std::string name;
    
    void save(const std::string& filename) const {
        std::ofstream file(filename, std::ios::binary);
        
        // Header
        uint32_t magic = MAGIC;
        uint32_t version = VERSION;
        file.write(reinterpret_cast<const char*>(&magic), sizeof(magic));
        file.write(reinterpret_cast<const char*>(&version), sizeof(version));
        
        // Data
        file.write(reinterpret_cast<const char*>(&score), sizeof(score));
        writeString(file, name);
    }
    
    bool load(const std::string& filename) {
        std::ifstream file(filename, std::ios::binary);
        if (!file) return false;
        
        // Validate the header
        uint32_t magic, version;
        file.read(reinterpret_cast<char*>(&magic), sizeof(magic));
        file.read(reinterpret_cast<char*>(&version), sizeof(version));
        
        if (magic != MAGIC) {
            std::cerr << "Invalid file format\n";
            return false;
        }
        
        if (version > VERSION) {
            std::cerr << "File version too new\n";
            return false;
        }
        
        // Load according to the version
        if (version == 1) {
            loadV1(file);
        } else if (version == 2) {
            loadV2(file);
        }
        
        return true;
    }
    
private:
    void loadV1(std::istream& in) {
        in.read(reinterpret_cast<char*>(&score), sizeof(score));
        // V1 has no name field
        name = "Unknown";
    }
    
    void loadV2(std::istream& in) {
        in.read(reinterpret_cast<char*>(&score), sizeof(score));
        name = readString(in);
    }
};

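About the code above: MAGIC (which identifies the file format) and VERSION are written at the very start of the file, and on load the code checks that the magic matches and that the version is within the supported range. Keeping a separate loadV1 and loadV2 means files from earlier versions remain readable after the format changes, which preserves compatibility.

Adding a checksum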

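#include <cstdint>
#include <numeric>
#include <vector>

uint32_t calculateChecksum(const std::vector<char>& data) {
    // Sum every byte as an unsigned value
    return std::accumulate(data.begin(), data.end(), 0u,
                           [](uint32_t sum, char c) {
                               return sum + static_cast<uint8_t>(c);
                           });
}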

void saveWithChecksum(const std::string& filename, const std::vector<char>& data) {
    std::ofstream file(filename, std::ios::binary);
    
    // Compute the checksum
    uint32_t checksum = calculateChecksum(data);
    
    // Store the checksum
    file.write(reinterpret_cast<const char*>(&checksum), sizeof(checksum));
    
    // Store the data
    uint32_t size = static_cast<uint32_t>(data.size());
    file.write(reinterpret_cast<const char*>(&size), sizeof(size));
    file.write(data.data(), size);
}

bool loadWithChecksum(const std::string& filename, std::vector<char>& data) {
    std::ifstream file(filename, std::ios::binary);
    if (!file) return false;
    
    // Read the checksum
    uint32_t storedChecksum;
    file.read(reinterpret_cast<char*>(&storedChecksum), sizeof(storedChecksum));
    
    // Read the data
    uint32_t size;
    file.read(reinterpret_cast<char*>(&size), sizeof(size));
    if (!file) return false;  // the header was truncated
    data.resize(size);
    file.read(data.data(), size);
    if (!file) return false;  // the payload was truncated
    
    // Verify the checksum
    uint32_t calculatedChecksum = calculateChecksum(data);
    if (storedChecksum != calculatedChecksum) {
        std::cerr << "Checksum mismatch! File corrupted.\n";
        return false;
    }
    
    return true;
}

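About the code above: on save, a checksum over the whole data (here, the sum of the bytes) is computed and written at the front of the file, followed by the size and the data. On load, the checksum, size, and data are read, the checksum is recomputed the same way, and the two are compared; if they do not match, the file is treated as corrupted.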
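
A byte sum is easy to follow but weak: it cannot tell when two bytes have been swapped. Where stronger detection is wanted, a CRC is a common step up. The sketch below is a straightforward bitwise CRC-32 (the reflected 0xEDB88320 polynomial used by zlib and PNG), written for clarity rather than speed; byteSum, crc32Bitwise, and crc32Of are names chosen for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// The scheme from the text above: a plain sum of the bytes.
std::uint32_t byteSum(const std::string& data) {
    std::uint32_t sum = 0;
    for (unsigned char c : data) sum += c;
    return sum;
}

// Bitwise CRC-32 with the reflected polynomial 0xEDB88320.
std::uint32_t crc32Bitwise(const unsigned char* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < size; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit) {
            // Shift right; XOR in the polynomial when the dropped bit was 1.
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

std::uint32_t crc32Of(const std::string& s) {
    return crc32Bitwise(reinterpret_cast<const unsigned char*>(s.data()), s.size());
}
```

For example, "ab" and "ba" have the same byte sum but different CRC-32 values, and the standard check input "123456789" produces 0xCBF43926. A table-driven version or an existing library would be the usual choice when throughput matters.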

6. Common problems and fixes

Problem 1: “I opened the file but the data looks wrong”

Cause: std::ios::binary was left out. Streams open in text mode by default, and on Windows text mode converts \n to \r\n, which corrupts binary data.

// ❌ Wrong
std::ofstream file("data.bin");  // text mode!

// ✅ Right
std::ofstream file("data.bin", std::ios::binary);

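Problem 2: “The numbers are completely different when read on another PC”

Cause: an endianness mismatch. x86 is little-endian; network byte order and some embedded systems are big-endian.

Fix: always store in one endianness (usually little-endian), and on big-endian platforms convert before saving and after loading.

// ✅ Portable storage
uint32_t value = 12345;
uint32_t le = toLittleEndian32(value);  // always little-endian
out.write(reinterpret_cast<const char*>(&le), sizeof(le));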

Problem 3: “I stored a std::string as-is and it was broken after loading”

Cause: std::string holds a pointer internally, so a memory dump stores only the address. In another process or another run, that address means nothing.

// ❌ Never do this
std::string name = "Player";
file.write(reinterpret_cast<const char*>(&name), sizeof(name));  // only the address is stored!

// ✅ Length + data
uint32_t len = static_cast<uint32_t>(name.size());
file.write(reinterpret_cast<const char*>(&len), sizeof(len));
file.write(name.data(), len);

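Problem 4: “I read the length and got 0xFFFFFFFF (a huge value)”

Cause: the read order is wrong, or the offset drifted at an earlier field. If even one field has the wrong order or type, everything after it is shifted.

Fix: document the serialization/deserialization order, and verify the round trip (save → load → compare) with unit tests.

// ✅ Keep the order as named constants
enum class FieldOrder : size_t {
    MAGIC, VERSION, LEVEL, HEALTH, NAME_LEN, NAME_DATA, INVENTORY_COUNT, INVENTORY
};
// serialize/deserialize must follow this order strictly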
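
A round-trip check needs no files; an in-memory std::stringstream is enough. The following is a minimal sketch under stated assumptions: Record is a hypothetical two-field type invented for this example, and the fields are written in host byte order to keep it short.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <limits>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical record used only for this sketch.
struct Record {
    std::uint32_t level = 0;
    std::string name;
};

// Field order: level, name length, name bytes (host byte order, for brevity).
void serialize(std::ostream& out, const Record& r) {
    assert(r.name.size() <= std::numeric_limits<std::uint32_t>::max());
    const std::uint32_t len = static_cast<std::uint32_t>(r.name.size());
    out.write(reinterpret_cast<const char*>(&r.level), sizeof(r.level));
    out.write(reinterpret_cast<const char*>(&len), sizeof(len));
    out.write(r.name.data(), len);
}

// Must read in exactly the order serialize wrote.
bool deserialize(std::istream& in, Record& r) {
    std::uint32_t len = 0;
    if (!in.read(reinterpret_cast<char*>(&r.level), sizeof(r.level))) return false;
    if (!in.read(reinterpret_cast<char*>(&len), sizeof(len))) return false;
    r.name.assign(len, '\0');
    return len == 0 || static_cast<bool>(in.read(&r.name[0], len));
}

// Round-trip check: save to memory, load back, compare every field.
bool roundTrips(const Record& original) {
    std::stringstream buffer(std::ios::in | std::ios::out | std::ios::binary);
    serialize(buffer, original);
    Record loaded;
    if (!deserialize(buffer, loaded)) return false;
    return loaded.level == original.level && loaded.name == original.name;
}
```

Calling roundTrips from a unit test with a few representative records (empty name, maximum values, embedded zero bytes) catches most ordering mistakes: if serialize and deserialize ever disagree, the comparison fails immediately.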

Problem 5: “I read to the end of the file but there is still data left”

Cause: size_t was stored as-is. size_t is 4 bytes on 32-bit targets and 8 bytes on 64-bit targets, so its size differs by platform.

// ❌ Platform-dependent
size_t count = vec.size();
file.write(reinterpret_cast<const char*>(&count), sizeof(count));  // 4 or 8 bytes

// ✅ Fixed-size type
uint32_t count = static_cast<uint32_t>(vec.size());
file.write(reinterpret_cast<const char*>(&count), sizeof(count));  // always 4 bytes

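Problem 6: “The checksum matches but the data looks wrong”

Cause: a version mismatch. When a file saved by a newer version is read by an older one, the new fields may be misinterpreted instead of skipped.

Fix: read the version first on load, and read exactly the right fields in a per-version branch. Reject unknown versions.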

if (version == 1) loadV1(in);
else if (version == 2) loadV2(in);
else {
    std::cerr << "Unsupported version " << version << "\n";
    return false;
}

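7. Performance comparison

Binary vs. text vs. JSON

Method	Save 100,000 ints	Load 100,000 ints	File size	Notes
Binary	~2ms	~1ms	400KB	Fastest and smallest
Text (space-separated)	~15ms	~25ms	~600KB	Easy for humans to read
JSON	~80ms	~120ms	~900KB	High parsing overhead

Illustrative figures only: based on a typical PC, and they vary with the implementation

Bottlenecks

Disk I/O: the biggest bottleneck. SSDs are tens of times faster than HDDs.
Repeated small writes: calling write thousands of times adds per-call overhead, and system call overhead whenever the data reaches the OS unbuffered.
Buffering: std::ofstream uses an internal buffer by default, so writing bulk data in one call is the better choice.

Optimization tips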

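// ❌ Slow: one write call per field
for (const auto& item : items) {
    out.write(reinterpret_cast<const char*>(&item.id), sizeof(item.id));
    out.write(reinterpret_cast<const char*>(&item.value), sizeof(item.value));
}

// ✅ Faster: gather the fields in a buffer and write once
// (fields are appended one by one, so struct padding never reaches the file)
std::vector<char> buffer;
buffer.reserve(items.size() * sizeof(Item));
for (const auto& item : items) {
    const char* id = reinterpret_cast<const char*>(&item.id);
    const char* value = reinterpret_cast<const char*>(&item.value);
    buffer.insert(buffer.end(), id, id + sizeof(item.id));
    buffer.insert(buffer.end(), value, value + sizeof(item.value));
}
out.write(buffer.data(), buffer.size());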

Memory-mapped files (large data)

For large files of hundreds of MB or more, mapping the file into memory with mmap can improve read performance considerably. It can be used alongside C++17 std::filesystem.

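// Linux/macOS example (abridged)
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>   // lseek, close

void readLargeFile(const char* path) {
    int fd = open(path, O_RDONLY);
    if (fd == -1) return;
    
    // Find the file size; an empty file cannot be mapped
    off_t end = lseek(fd, 0, SEEK_END);
    if (end <= 0) { close(fd); return; }
    size_t size = static_cast<size_t>(end);
    
    void* addr = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (addr == MAP_FAILED) { close(fd); return; }
    
    // Walk through addr as a char*
    const char* data = static_cast<const char*>(addr);
    // ... parse the serialized data ...
    
    munmap(addr, size);
    close(fd);
}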

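8. Production patterns

Pattern 1: Atomic write

If the program crashes in the middle of a save, a half-written file can be left behind. Writing to a temporary file and swapping it in with rename on success updates the file atomically.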

#include <filesystem>
namespace fs = std::filesystem;

bool saveAtomic(const std::string& path, const std::vector<char>& data) {
    std::string tmpPath = path + ".tmp";
    std::ofstream file(tmpPath, std::ios::binary);
    if (!file || !file.write(data.data(), data.size())) return false;
    file.close();
    
    try {
        fs::rename(tmpPath, path);
        return true;
    } catch (...) {
        fs::remove(tmpPath);
        return false;
    }
}

Pattern 2: Backup + rollback

For important save files, keep a backup before overwriting so that the previous version can be restored if the save or a later load fails.

bool saveWithBackup(const std::string& path, const GameSave& save) {
    if (fs::exists(path)) {
        fs::rename(path, path + ".bak");
    }
    if (!save.save(path)) {
        if (fs::exists(path + ".bak")) {
            fs::rename(path + ".bak", path);
        }
        return false;
    }
    fs::remove(path + ".bak");
    return true;
}

Pattern 3: Documenting the schema

Keeping the serialization format written down helps teammates, and your future self, when it comes to maintenance.

[Game save format v1]
- MAGIC: uint32_t = 0x53415645
- VERSION: uint32_t = 1
- level: uint32_t
- health: float
- positionX, positionY: float
- playerName: uint32_t len + uint8_t[len]
- inventory: uint32_t count + uint32_t[count]
- All integers/floats: Little Endian

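Pattern 4: Library selection guide

Situation	Recommendation
C++ only, simple structs	Implement it yourself (the approach in this article)
Cross-platform, complex schema	Protocol Buffers, FlatBuffers
Must be human-readable	JSON (nlohmann/json), YAML
Maximum performance, zero-copy	FlatBuffers
Network protocols	Protocol Buffers, Cap’n Proto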

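Summary

Item	Details
Binary mode	`std::ios::binary` flag
write/read	Use `reinterpret_cast<char*>`
Strings	Store length + data
Vectors	Store count + data
Versioning	Magic number + version number
Endianness	Platform-independent storage
Checksum	Data integrity verification

Core principles:

Store only POD types directly
For dynamic data, store the size first
Include version information
Use a checksum to verify integrity
Account for endianness

Implementation checklist

When implementing serialization, check the following.

Open files with std::ios::binary
Use fixed-size types such as uint32_t (avoid size_t)
Store strings/vectors in “length + data” order
Validate the format with a magic number + version
Convert endianness (when cross-platform)
Checksum or CRC (optional)
Atomic write (temporary file → rename)
Document the format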

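Frequently asked questions (FAQ)

Q. When is this used in practice?

A. This is a guide to C++ binary files and serialization: saving and loading structs with read/write, endianness problems and their fixes, padding and alignment issues, version compatibility, real game save files … In practice, apply the examples and the selection guide from the article above.

Q. What should I read first?

A. Following the previous-article link at the bottom of each article lets you learn in order. The C++ series table of contents shows the overall flow.

Q. How can I study this in more depth?

A. Refer to cppreference and the official documentation of the relevant libraries. The reference links at the end of the article are also worth using.

One-line summary: with binary mode and reinterpret_cast you can write structs to a file and read them back. A good next read is stringstream (#11-3).

Previous article: C++ Practical Guide #11-1: File I/O Basics

Next article: C++ Practical Guide #11-3: stringstream and Formatting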
