C++ std::vector Basics — Initialization, Operations &

Key takeaways

Master std::vector: initialization, operations, capacity, reserve vs resize, avoiding UB and segmentation faults, and practical optimization.

What you get from this article (about 35 minutes)

TL;DR: Master std::vector, the STL workhorse: initialization, capacity, iterator rules, and how to avoid undefined behavior and segmentation faults.

After reading you will be able to:

  • Initialize, access, insert, and erase elements correctly
  • Explain size() vs capacity(), and reserve() vs resize()
  • Avoid common UB cases and apply practical performance patterns

Practical uses:

  • Dynamic arrays with automatic growth
  • Fewer reallocations with reserve
  • Safer access with at() vs []

Level: beginner | Code samples: 20+ | Essential STL


Opening: segmentation fault from index access

“I accessed vec[10] and the program crashed”

std::vector is the STL dynamic array: it grows as needed, stores elements contiguously (cache-friendly), and supports O(1) index access. For arrays vs dynamic arrays from an algorithms perspective, pair this with Arrays and lists (algorithm series). However, operator[] does not check bounds—out-of-range access is undefined behavior (often a segmentation fault).

Analogy: a vector is an “expandable chest of drawers.” If there are only three drawers, opening the tenth is invalid. A C-style array int arr[100] has fixed size; new[]/delete[] are manual. A vector uses RAII for cleanup, and you tune performance with size(), capacity(), and reserve().

Problematic code:

std::vector<int> vec = {1, 2, 3};
int value = vec[10];  // ❌ size is 3; [10] is out of range → UB!

Explanation: vec.size() is 3, so valid indices are 0, 1, and 2. vec[10] is UB; depending on the environment it may crash.

Fixes:

// ✅ at(): throws std::out_of_range if out of range
int value = vec.at(10);  // exception instead of silent UB
// ✅ check before access
if (10 < vec.size()) {
    int value = vec[10];
}
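
When a fallback value is acceptable, the at() fix can be wrapped in a small helper. The sketch below is illustrative only: valueOr and demoValueOr are hypothetical names, not standard library functions.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns vec.at(index), or `fallback` when the index is out of range.
int valueOr(const std::vector<int>& vec, std::size_t index, int fallback) {
    try {
        return vec.at(index);            // bounds-checked; throws std::out_of_range
    } catch (const std::out_of_range&) {
        return fallback;                 // defined behavior instead of UB
    }
}

// Usage sketch: index 10 is out of range for a three-element vector.
void demoValueOr() {
    const std::vector<int> vec = {1, 2, 3};
    assert(valueOr(vec, 1, -1) == 2);    // in range
    assert(valueOr(vec, 10, -1) == -1);  // out of range: fallback
}
```

Exceptions are a reasonable choice when bad indices are rare; in a hot loop, an explicit index < vec.size() check is usually cheaper.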

Common failure scenarios

Scenario 1: indexing after reserve

reserve(100) increases capacity only; size() stays 0. Access like vec[0] = 42 is UB. Use resize() or push_back()/emplace_back() to create elements.

Scenario 2: iterator invalidation in an erase loop

After vec.erase(it), it is invalid; ++it is UB. erase returns an iterator to the next element—use it = vec.erase(it).

Scenario 3: one million push_backs take ~10 seconds

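Running out of capacity triggers a reallocation, typically to double the old capacity. With doubling, one million appends cause about 20 reallocations, and each one moves or copies every existing element, roughly a million extra element moves in total. reserve(estimated_count) avoids them.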

Scenario 4: modifying the vector inside a range-based for

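Calling push_back or erase on vec inside for (auto& x : vec) leads to UB: the loop holds iterators internally, push_back always invalidates end() (and every iterator if it reallocates), and erase invalidates everything from the erased position onward. Use index loops or explicit iterators when you must mutate.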

Scenario 5: data() passed to a C API, then the vector changes

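A pointer from vec.data() is invalidated by any operation that reallocates (push_back, insert, reserve, or resize beyond the current capacity), and erase invalidates pointers to the erased element and everything after it. Do not mutate the vector while a C API might still use the pointer.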

Scenario 6: repeated temporaries with push_back

vec.push_back(MyClass(a, b)) may create a temporary then move/copy. emplace_back(a, b) constructs in place and often avoids the extra temporary.

Scenario 7: returning a large vector

Since C++11, returning by value usually moves (or elides). return vec is typically O(1) transfer cost for the buffer.

Vector vs other containers (summary):

flowchart TB
  subgraph choice[Choosing a container]
    A[Need a dynamic array] --> B{Use case}
    B -->|Index access, append at end| C["std::vector"]
    B -->|Insert/erase at both ends| D["std::deque"]
    B -->|Lookup by key| E["std::map / unordered_map"]
    B -->|Sorted unique keys| F["std::set"]
  end
  subgraph vector_opt[Vector tuning]
    C --> G["reserve to limit reallocations"]
    C --> H["emplace_back to avoid extra copies"]
  end

Production note: this article reflects real issues seen in large C++ codebases—pitfalls and debugging angles that short tutorials often skip.

Table of contents

  1. Problem scenarios
  2. Vector initialization
  3. Vector operations
  4. Capacity
  5. Complete examples
  6. Common errors
  7. Performance tips
  8. Best practices
  9. Production patterns
  10. Checklist

1. Problem scenarios

Scenario 1: crash on empty rows after CSV parsing

Reading lines into a vector and accessing vec[0] without checking for an empty row crashes.

#include <sstream>
#include <string>
#include <vector>
// ❌ unsafe: no empty-line handling
std::vector<std::string> parseLine(const std::string& line) {
    std::vector<std::string> result;
    std::istringstream iss(line);
    std::string cell;
    while (std::getline(iss, cell, ',')) {
        result.push_back(cell);
    }
    return result;
}
// caller
auto cells = parseLine("");  // result is empty
std::string first = cells[0];  // ❌ UB / likely crash

Fix:

// ✅ guard empty vectors
if (!cells.empty()) {
    std::string first = cells[0];
}
// or
std::string first = cells.empty() ? "" : cells[0];

Scenario 2: infinite loop when erasing conditionally

Wrong iterator handling in a removal loop causes infinite loops or crashes.

// ❌ wrong: use iterator after erase without updating
std::vector<int> vec = {1, 2, 2, 3, 2, 4};
for (auto it = vec.begin(); it != vec.end(); ++it) {
    if (*it == 2) {
        vec.erase(it);  // it is invalid; ++it is UB
    }
}

Fix: erase–remove idiom, or use the return value of erase.

// ✅ erase–remove (O(n)); std::remove is declared in <algorithm>
vec.erase(std::remove(vec.begin(), vec.end(), 2), vec.end());
// ✅ manual loop with erase return value
for (auto it = vec.begin(); it != vec.end(); ) {
    if (*it == 2) {
        it = vec.erase(it);
    } else {
        ++it;
    }
}

Scenario 3: reallocations while loading bulk data

Parsing logs or CSV without reserve causes repeated reallocations.

// ❌ slow: many reallocations
std::vector<Record> loadRecords(const std::string& path) {
    std::vector<Record> records;
    std::ifstream file(path);
    Record r;
    while (readRecord(file, r)) {
        records.push_back(r);  // realloc when capacity runs out
    }
    return records;
}

Fix: estimate count (header, file size, etc.) and reserve.

// ✅ faster: reserve up front
std::vector<Record> loadRecords(const std::string& path) {
    size_t estimated = estimateRecordCount(path);
    std::vector<Record> records;
    records.reserve(estimated);
    // ...
}

Scenario 4: passing a huge vector by value

Passing by value copies all elements.

// ❌ slow: copies up to millions of elements
void process(std::vector<int> data) { /* ... */ }
// ✅ read-only: const reference
void process(const std::vector<int>& data) { /* ... */ }
// ✅ transfer ownership: move
void takeOwnership(std::vector<int>&& data) { /* ... */ }

2. Vector initialization

Basic patterns

Pick the constructor that matches intent—clearer code and fewer extra allocations.

#include <vector>
#include <iostream>
int main() {
    // 1. empty
    std::vector<int> v1;
    // 2. size only (value-initialized elements; int → 0)
    std::vector<int> v2(5);  // {0, 0, 0, 0, 0}
    // 3. size + value
    std::vector<int> v3(5, 42);  // {42, 42, 42, 42, 42}
    // 4. initializer list (C++11)
    std::vector<int> v4 = {1, 2, 3, 4, 5};
    // 5. copy
    std::vector<int> v5 = v4;
    // 6. move (C++11)
    std::vector<int> v6 = std::move(v4);  // v4 is empty
    // 7. iterator range copy
    std::vector<int> v7(v5.begin(), v5.end());
    // 8. sub-range
    std::vector<int> v8(v5.begin() + 1, v5.begin() + 4);  // {2, 3, 4}
    return 0;
}

Notes:

  • v2(5): five value-initialized elements (0 for int). Compare v3(5, 42), which fills all five with 42. Parentheses select the count constructors; braces use list-initialization rules.
  • v4 = {1,2,3,4,5}: list initialization.
  • v6 = std::move(v4): steals the buffer; v4 becomes empty.
  • v8: half-open range [begin, end).
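
The buffer-stealing behavior of move construction noted in the list can be checked by comparing data() before and after the move. This is a minimal sketch; moveKeepsBuffer is a hypothetical name introduced for illustration.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Returns true if move construction handed the same heap buffer to the new vector.
bool moveKeepsBuffer() {
    std::vector<int> source = {1, 2, 3, 4, 5};
    const int* before = source.data();            // address of the first element
    std::vector<int> target = std::move(source);  // move construction: no element copies
    assert(source.empty());                       // guaranteed for vector's move constructor
    return target.data() == before && target.size() == 5;
}
```

Because only the internal pointers change hands, the cost does not depend on the number of elements.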

v2(5) vs v2{5}

std::vector<int> a(5);    // five elements {0,0,0,0,0}
std::vector<int> b{5};    // one element {5}
std::vector<int> c(5, 2); // five elements {2,2,2,2,2}
std::vector<int> d{5, 2}; // two elements {5, 2}

Explanation: () passes constructor arguments; {} uses initializer-list rules. a(5) is “length 5”; b{5} is “one element with value 5.”
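
The difference is easy to verify by comparing element counts. The sketch below assumes nothing beyond the four declarations above; InitSizes and initSizes are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Element counts produced by the four spellings discussed above.
struct InitSizes {
    std::size_t parenOne, braceOne, parenTwo, braceTwo;
};

InitSizes initSizes() {
    std::vector<int> a(5);     // count constructor: five zeros
    std::vector<int> b{5};     // initializer list: one element
    std::vector<int> c(5, 2);  // count + value: five twos
    std::vector<int> d{5, 2};  // initializer list: two elements
    assert(a.front() == 0 && b.front() == 5 && c.front() == 2 && d.front() == 5);
    return {a.size(), b.size(), c.size(), d.size()};
}
```

Calling initSizes() yields the counts 5, 1, 5, and 2 in that order.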

User-defined types

struct Point {
    int x, y;
    Point(int x, int y) : x(x), y(y) {}
};
// emplace_back: construct in place from constructor arguments
std::vector<Point> points;
points.emplace_back(1, 2);      // Point(1, 2) in place
points.push_back(Point(3, 4));  // temporary then move

3. Vector operations

Insert / append

std::vector<int> vec = {1, 2, 3};
vec.push_back(4);        // copy or move
vec.emplace_back(5);   // construct in place
// insert at position (slow: O(n) shifts)
vec.insert(vec.begin(), 0);
vec.insert(vec.begin() + 2, 99);
vec.insert(vec.end(), {10, 11, 12});

Notes: emplace_back forwards arguments to the element constructor—prefer it for heavy types. insert shifts everything after the position.
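
One way to see the difference between the two calls is to count constructor invocations on an instrumented type. Tracker, resetCounters, and demoPushVsEmplace below are hypothetical names used only for this sketch; reserve is called first so reallocation moves do not distort the counts.

```cpp
#include <cassert>
#include <vector>

// Instrumented element type (hypothetical, for illustration only).
struct Tracker {
    static int constructed;  // calls to Tracker(int)
    static int moved;        // calls to the move constructor
    static int copied;       // calls to the copy constructor
    int value;
    explicit Tracker(int v) : value(v) { ++constructed; }
    Tracker(Tracker&& other) noexcept : value(other.value) { ++moved; }
    Tracker(const Tracker& other) : value(other.value) { ++copied; }
};
int Tracker::constructed = 0;
int Tracker::moved = 0;
int Tracker::copied = 0;

void resetCounters() { Tracker::constructed = Tracker::moved = Tracker::copied = 0; }

void demoPushVsEmplace() {
    std::vector<Tracker> items;
    items.reserve(2);             // rule out reallocation moves

    resetCounters();
    items.push_back(Tracker(1));  // temporary, then a move into the vector
    assert(Tracker::constructed == 1 && Tracker::moved == 1 && Tracker::copied == 0);

    resetCounters();
    items.emplace_back(2);        // constructed directly in the vector's storage
    assert(Tracker::constructed == 1 && Tracker::moved == 0 && Tracker::copied == 0);
}
```

For types with a cheap move the saved step is small; the gain matters mostly when moving is expensive or unavailable.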

Erase / clear

std::vector<int> vec = {1, 2, 3, 4, 5};
vec.pop_back();                     // remove last, O(1)
vec.erase(vec.begin());             // erase first, O(n)
vec.erase(vec.begin(), vec.end());  // clear range (like clear)
vec.clear();                        // size 0; capacity may stay

Notes: erase(it) returns the iterator following the removed element. clear() keeps capacity unless you shrink later.

Element access

std::vector<int> vec = {10, 20, 30};
int a = vec[0];   // no check
int b = vec[1];
int c = vec.at(2);  // throws if out of range
int first = vec.front();
int last = vec.back();
int* ptr = vec.data();  // like &vec[0] for non-empty

Notes: prefer at() in debug-heavy or untrusted-index code. data() is invalidated when the vector reallocates or is structurally changed.

Size and capacity

std::vector<int> vec = {1, 2, 3};
size_t n = vec.size();
bool e = vec.empty();
size_t cap = vec.capacity();
vec.reserve(100);    // at least this capacity; size unchanged
vec.resize(10);      // size 10; new elements value-initialized
vec.resize(5);       // shrink: drop tail elements
vec.shrink_to_fit(); // non-binding request to shrink capacity

Notes: reserve does not change size. resize changes size. shrink_to_fit is a hint—implementations may ignore it.
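
The following sketch walks through the same calls and checks size() and capacity() after each step. demoReserveVsResize is a hypothetical name; the capacity checks use >= because the exact value is up to the implementation.

```cpp
#include <cassert>
#include <vector>

void demoReserveVsResize() {
    std::vector<int> vec;

    vec.reserve(100);               // storage only
    assert(vec.size() == 0);        // still no elements: vec[0] would be UB here
    assert(vec.capacity() >= 100);

    vec.resize(10);                 // creates ten value-initialized ints
    assert(vec.size() == 10);
    assert(vec[9] == 0);            // now a valid index

    vec.resize(5);                  // drops the last five elements
    assert(vec.size() == 5);
    assert(vec.capacity() >= 100);  // shrinking the size does not release storage
}
```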

Iterators

std::vector<int> vec = {1, 2, 3, 4, 5};
for (auto it = vec.begin(); it != vec.end(); ++it) {
    std::cout << *it << " ";
}
for (auto it = vec.rbegin(); it != vec.rend(); ++it) {
    std::cout << *it << " ";
}
for (const auto& x : vec) {
    std::cout << x << " ";
}

Use const auto& in range-for when you only read.


4. Capacity

size vs capacity

flowchart LR
    subgraph vec["vector (size=3, capacity=4)"]
        direction TB
        V0["(0) 10"]
        V1["(1) 20"]
        V2["(2) 30"]
        V3["(3) empty slot"]
        V0 --> V1 --> V2 --> V3
    end
    size["size() = 3 — element count"]
    cap["capacity() = 4 — room before reallocation"]

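Explanation: size is the number of elements in use; capacity is the number of allocated slots. A push_back when size() == capacity() reallocates to a larger buffer. The growth factor is implementation-defined: libstdc++ and libc++ double the capacity, while MSVC grows by about 1.5×.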

reserve: allocate ahead

#include <iostream>
#include <vector>
std::vector<int> vec;
vec.reserve(1000);  // capacity ≥ 1000, size still 0
for (int i = 0; i < 1000; ++i) {
    vec.push_back(i);
}
std::cout << "size: " << vec.size() << ", capacity: " << vec.capacity() << "\n";

Growth without reserve

flowchart LR
    subgraph growth[Typical capacity growth]
        G0[0] --> G1[1]
        G1 --> G2[2]
        G2 --> G3[4]
        G3 --> G4[8]
        G4 --> G5[16]
        G5 --> G6["..."]
    end

Roughly 0→1→2→4→8→…; ~1e6 inserts ⇒ on the order of ~20 reallocations without reserve.
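
The reallocation count can be measured rather than estimated by watching capacity() change during a run of push_back calls. This is a minimal sketch; countReallocations is a hypothetical helper, and the exact count without reserve depends on the implementation's growth factor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how often capacity() changes while appending n ints.
// Each change corresponds to one reallocation.
std::size_t countReallocations(std::size_t n, bool reserveFirst) {
    std::vector<int> vec;
    if (reserveFirst) {
        vec.reserve(n);  // single allocation up front
    }
    std::size_t reallocations = 0;
    std::size_t lastCapacity = vec.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        vec.push_back(static_cast<int>(i));
        if (vec.capacity() != lastCapacity) {
            ++reallocations;
            lastCapacity = vec.capacity();
        }
    }
    assert(vec.size() == n);
    return reallocations;
}
```

With reserveFirst set to true the function returns 0; without it, the result grows logarithmically with n.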

shrink_to_fit

std::vector<int> vec(1000);
vec.resize(10);       // size 10, capacity may stay 1000
vec.shrink_to_fit();  // request to fit capacity to size

5. Complete examples

Example 1: push_back vs emplace_back

struct Item { int id; std::string name; Item(int i, std::string n) : id(i), name(std::move(n)) {} };
std::vector<Item> items;
items.reserve(4);
items.push_back(Item(1, "A"));   // temporary + move
items.emplace_back(2, "B");      // construct in place
Item temp(3, "C");
items.push_back(std::move(temp));
items.emplace_back(4, "D");

Example 2: reserve and shrink_to_fit

std::vector<int> vec;
vec.reserve(100);
for (int i = 0; i < 50; ++i) vec.push_back(i);
vec.shrink_to_fit();

Example 3: iteration styles

std::vector<int> vec = {10, 20, 30, 40, 50};
for (size_t i = 0; i < vec.size(); ++i) { /* vec[i] */ }
for (auto it = vec.begin(); it != vec.end(); ++it) { /* *it */ }
for (const auto& x : vec) { /* x */ }
for (auto it = vec.rbegin(); it != vec.rend(); ++it) { /* *it */ }

Do not call push_back/erase inside a range-based for over the same vector.
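
If elements must be appended while walking the vector, an index loop over the original size stays valid across reallocations. The sketch below assumes the goal is to append a doubled copy of every even element; appendDoubledEvens is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Appends 2*x for every even x that was in the vector when the call started.
void appendDoubledEvens(std::vector<int>& vec) {
    const std::size_t originalSize = vec.size();  // fix the range before mutating
    for (std::size_t i = 0; i < originalSize; ++i) {
        const int x = vec[i];                     // copy: a reference could dangle after push_back
        if (x % 2 == 0) {
            vec.push_back(2 * x);                 // may reallocate; indices stay meaningful
        }
    }
}

void demoAppendDoubledEvens() {
    std::vector<int> vec = {1, 2, 3, 4};
    appendDoubledEvens(vec);
    const std::vector<int> expected = {1, 2, 3, 4, 4, 8};
    assert(vec == expected);
}
```

Indices survive reallocation because they are offsets, whereas iterators and references point into the old buffer.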

Example 4: move semantics

std::vector<int> a = {1, 2, 3, 4, 5};
std::vector<int> b = std::move(a);  // a empty, b holds elements
auto getVec = []() {
    std::vector<int> v = {10, 20, 30};
    return v;  // RVO or move
};
std::vector<int> c = getVec();

Example 5: init and basic use

// g++ -std=c++17 -o vec_basic vec_basic.cpp && ./vec_basic
#include <vector>
#include <iostream>
int main() {
    std::vector<int> vec = {10, 20, 30};
    vec.push_back(40);
    vec.emplace_back(50);
    std::cout << "first: " << vec.front() << "\n";
    std::cout << "last: " << vec.back() << "\n";
    std::cout << "size: " << vec.size() << "\n";
    for (size_t i = 0; i < vec.size(); ++i) {
        std::cout << vec[i] << " ";
    }
    std::cout << "\n";
    for (const auto& x : vec) {
        std::cout << x << " ";
    }
    std::cout << "\n";
    return 0;
}

Sample output: first 10, last 50, size 5, then two lines printing 10 20 30 40 50.

Example 6: reserve + emplace_back for loading

#include <vector>
#include <string>
#include <fstream>
#include <sstream>
struct Record {
    int id;
    std::string name;
    double value;
    Record(int i, std::string n, double v) : id(i), name(std::move(n)), value(v) {}
};
std::vector<Record> loadRecords(const std::string& filename) {
    std::vector<Record> records;
    records.reserve(100000);
    std::ifstream file(filename);
    std::string line;
    while (std::getline(file, line)) {
        std::istringstream iss(line);
        int id;
        std::string name;
        double value;
        if (iss >> id >> name >> value) {
            records.emplace_back(id, std::move(name), value);
        }
    }
    return records;
}

Example 7: erase–remove

#include <vector>
#include <algorithm>
std::vector<int> vec = {1, 2, 3, 2, 4, 2, 5};
vec.erase(std::remove(vec.begin(), vec.end(), 2), vec.end());
vec = {1, 2, 3, 4, 5, 6};
vec.erase(std::remove_if(vec.begin(), vec.end(),
    [](int x) { return x % 2 == 0; }), vec.end());

std::remove compacts “kept” elements forward and returns the new logical end; erase from there to end().
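
The two steps can be packaged into one helper that also reports how many elements were removed. eraseIf below is a hypothetical helper written for this sketch, not a standard function.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: removes every element matching `pred`, returns how many were removed.
template <typename T, typename Pred>
std::size_t eraseIf(std::vector<T>& vec, Pred pred) {
    auto newEnd = std::remove_if(vec.begin(), vec.end(), pred);  // kept elements move to the front
    const auto removed = static_cast<std::size_t>(vec.end() - newEnd);
    vec.erase(newEnd, vec.end());                                // drop the leftover tail
    return removed;
}

void demoEraseIf() {
    std::vector<int> vec = {1, 2, 3, 4, 5, 6};
    const std::size_t removed = eraseIf(vec, [](int x) { return x % 2 == 0; });
    const std::vector<int> expected = {1, 3, 5};
    assert(removed == 3);
    assert(vec == expected);
}
```

C++20 provides std::erase and std::erase_if for std::vector with the same effect, so a hand-written helper is only needed on older standards.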

Example 8: concatenate vectors

std::vector<int> a = {1, 2, 3};
std::vector<int> b = {4, 5, 6};
a.insert(a.end(), b.begin(), b.end());
std::vector<int> c;
c.reserve(a.size() + b.size());
c.insert(c.end(), a.begin(), a.end());
c.insert(c.end(), b.begin(), b.end());

Example 9: unique after sort

std::vector<int> vec = {3, 1, 2, 2, 1, 3};
std::sort(vec.begin(), vec.end());
vec.erase(std::unique(vec.begin(), vec.end()), vec.end());

6. Common errors

1. Subscript out of range / segfault

Cause: [] with index ≥ size().

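std::vector<int> vec = {1, 2, 3};
std::size_t index = 10;
// int value = vec[index];        // UB: no bounds check
try {
    int value = vec.at(index);    // throws std::out_of_range
} catch (const std::out_of_range&) {
    // handle the bad index
}
if (index < vec.size()) {
    int value = vec[index];       // only reached for valid indices
}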

2. Iterator invalidation after erase

Use it = vec.erase(it) or erase–remove.

3. reserve does not set size

std::vector<int> vec;
vec.reserve(100);
// vec[0] = 42;  // UB: size still 0
vec.resize(100);
vec[0] = 42;

4. Invalid iterators during push_back/insert

If you must mutate while iterating, use indices carefully or collect changes separately.

5. data() after reallocation

Do not use raw pointers from data() after operations that reallocate or invalidate.
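
A simple discipline is to finish every mutation first and take data() only at the call site. In the sketch below, c_sum stands in for a C API and sumAfterAppend is a hypothetical wrapper; neither name comes from a real library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for a C API that reads `count` ints from `values` (hypothetical).
long c_sum(const int* values, std::size_t count) {
    long total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}

long sumAfterAppend(std::vector<int>& vec, int extra) {
    vec.push_back(extra);                  // finish all mutations first
    assert(!vec.empty());
    return c_sum(vec.data(), vec.size());  // fetch data() at the call site, never cache it
}

void demoSumAfterAppend() {
    std::vector<int> vec = {1, 2, 3};
    assert(sumAfterAppend(vec, 4) == 10);
}
```

If the C API keeps the pointer beyond the call, the vector must stay unmodified and alive for that whole period.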

6. std::vector<bool> specialization

Packed bits; operator[] does not yield a real bool&. Prefer std::vector<uint8_t> or std::vector<char> when you need references or stable addresses.
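
The proxy behavior can be confirmed at compile time. The sketch below uses static_assert to compare the types returned by operator[]; BoolElem, ByteElem, setFlag, and demoFlags are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <utility>
#include <vector>

// vector<bool>::operator[] returns a proxy object, not bool&.
using BoolElem = decltype(std::declval<std::vector<bool>&>()[0]);
static_assert(!std::is_same<BoolElem, bool&>::value, "vector<bool> yields a proxy");

// vector<uint8_t>::operator[] returns an ordinary reference.
using ByteElem = decltype(std::declval<std::vector<std::uint8_t>&>()[0]);
static_assert(std::is_same<ByteElem, std::uint8_t&>::value, "plain reference");

// A function taking a real reference works with the byte vector.
void setFlag(std::uint8_t& flag) { flag = 1; }

void demoFlags() {
    std::vector<std::uint8_t> flags(3, 0);
    setFlag(flags[1]);  // the bool& equivalent would not compile with std::vector<bool>
    assert(flags[0] == 0 && flags[1] == 1 && flags[2] == 0);
}
```

The byte vector uses eight times the memory of the packed form, which is the trade-off for ordinary references and addresses.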

7. Signed vs unsigned index

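size() returns an unsigned type (size_type, normally std::size_t). Index with std::size_t or use range-for. Mixing in int triggers sign-compare warnings, and vec.size() - 1 wraps around to a huge value when the vector is empty.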
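
Reverse index loops are where unsigned arithmetic most often goes wrong. The sketch below shows one common safe form; reversedCopy and demoReversedCopy are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copies the elements in reverse using an unsigned index that is never used after wrapping.
std::vector<int> reversedCopy(const std::vector<int>& vec) {
    std::vector<int> out;
    out.reserve(vec.size());
    // `i-- > 0` tests before decrementing, so the loop body never sees a wrapped index.
    // By contrast, `for (std::size_t i = vec.size() - 1; i >= 0; --i)` never terminates:
    // `i >= 0` is always true for an unsigned i, and size() - 1 wraps on an empty vector.
    for (std::size_t i = vec.size(); i-- > 0; ) {
        out.push_back(vec[i]);
    }
    return out;
}

void demoReversedCopy() {
    const std::vector<int> expected = {3, 2, 1};
    assert(reversedCopy({1, 2, 3}) == expected);
    assert(reversedCopy({}).empty());  // empty input is handled without special cases
}
```

Reverse iterators (rbegin()/rend()) avoid the index arithmetic entirely when the index itself is not needed.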

8. front() / back() on empty vector

Check empty() first.

9. (5) vs {5} confusion

See initialization section—different constructors.

10. Erasing in range-based for

Use erase–remove or iterator loops with proper erase return handling.

11. Indexing after reserve without resize

Either resize or fill with push_back/emplace_back before [i].


7. Performance tips

Tip 1: reserve for known upper bounds

std::vector<int> vec;
vec.reserve(1000000);
for (int i = 0; i < 1000000; ++i) {
    vec.push_back(i);
}

Tip 2: emplace_back for non-trivial types

struct Point { int x, y; Point(int x, int y) : x(x), y(y) {} };
std::vector<Point> points;
points.emplace_back(1, 2);  // builds Point(1, 2) directly in the vector

Tip 3: range-for with references

for (const auto& s : vec) { /* ... */ }

Tip 4: erase–remove vs repeated erase

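erase–remove makes a single O(n) pass. Calling erase once per matching element shifts the tail every time and degrades to O(n²) when many elements match.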

Tip 5: clear() then reuse (capacity often stays)

vec.clear();
// same capacity; subsequent push_backs may not reallocate immediately
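
The reuse pattern can be checked by refilling one buffer several times and watching capacity(). This sketch relies on the behavior of mainstream implementations, where clear() keeps the allocation; capacityStableAcrossRounds is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Refills the same buffer for several rounds and reports whether capacity ever changed.
bool capacityStableAcrossRounds(std::size_t rounds, std::size_t itemsPerRound) {
    std::vector<int> buffer;
    buffer.reserve(itemsPerRound);
    const std::size_t initialCapacity = buffer.capacity();
    for (std::size_t r = 0; r < rounds; ++r) {
        buffer.clear();  // size -> 0, storage kept
        for (std::size_t i = 0; i < itemsPerRound; ++i) {
            buffer.push_back(static_cast<int>(i));
        }
        assert(buffer.size() == itemsPerRound);
        if (buffer.capacity() != initialCapacity) {
            return false;  // a reallocation happened
        }
    }
    return true;
}
```

A result of true means every round after the first ran without touching the allocator.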

Rough comparison table

| Scenario | No reserve | With reserve | Gain (illustrative) |
| --- | --- | --- | --- |
| 1e6 int push_backs | ~50 ms | ~10 ms | ~5× |
| 1e5 string push_backs | ~120 ms | ~25 ms | ~4–5× |
| Heavy types: emplace vs push | ~2× work | baseline | ~2× |

8. Best practices

  1. reserve when you know approximate final size
  2. emplace_back / move for expensive types
  3. const auto& in read-only range-for
  4. erase–remove for bulk conditional removal
  5. at() when bounds must be checked
  6. Pass const& for reads; && or value when transferring ownership
  7. Return by value—move/RVO applies
  8. Never hold data() pointers across vector mutations that reallocate

9. Production patterns

Buffer reuse

class RequestHandler {
    std::vector<char> read_buffer_;
public:
    void handle(const char* data, size_t len) {
        // assign() replaces the contents and, on mainstream implementations, reuses the
        // existing allocation whenever len <= capacity(), so no clear()/reserve() is needed.
        read_buffer_.assign(data, data + len);
    }
};

Estimate then reserve

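std::vector<Record> loadFromFile(const std::string& path) {
    // file_size returns std::uintmax_t; convert once so std::max sees a single type
    const auto bytes = static_cast<std::size_t>(std::filesystem::file_size(path));
    // rough guess: sizeof(Record) only approximates the on-disk record size
    const std::size_t estimated = std::max<std::size_t>(bytes / sizeof(Record), 1000);
    std::vector<Record> records;
    records.reserve(estimated);
    // ... read records into `records` here ...
    return records;
}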

Conditional reserve before append

void addItems(std::vector<int>& vec, const std::vector<int>& newItems) {
    if (vec.capacity() - vec.size() < newItems.size()) {
        vec.reserve(vec.size() + newItems.size());
    }
    for (int x : newItems) {
        vec.push_back(x);
    }
}

CSV-style parse

std::vector<std::vector<std::string>> parseCSV(std::istream& in) {
    std::vector<std::vector<std::string>> rows;
    std::string line;
    while (std::getline(in, line)) {
        std::vector<std::string> row;
        row.reserve(16);
        std::istringstream iss(line);
        std::string cell;
        while (std::getline(iss, cell, ',')) {
            row.emplace_back(std::move(cell));
        }
        rows.push_back(std::move(row));
    }
    return rows;
}

Production checklist

  • reserve(estimate) before large inserts
  • Reuse vectors with clear() where capacity retention helps
  • emplace_back / move for heavy elements
  • No data() use across mutating calls
  • Profile before micro-optimizing

10. Implementation checklist

  • reserve when you know rough size
  • emplace_back for expensive types
  • const auto& in read-only loops
  • it = vec.erase(it) in manual erase loops
  • at() when indices are not proven safe
  • Avoid vector<bool> when you need normal references
  • Check empty() before front()/back()

Summary

| Topic | APIs |
| --- | --- |
| Init | {}, (n), (n, val), iterator ranges |
| Add | push_back, emplace_back, insert |
| Remove | pop_back, erase, clear |
| Access | [], at, front, back, data |
| Capacity | size, capacity, reserve, resize, shrink_to_fit |

Principles: reserve when you can; prefer emplace for heavy types; understand size vs capacity; use erase–remove for bulk deletes.


FAQ (inline)

Q. What if reserve is too large?
A. You waste memory. Overshooting reserve does not break correctness, only footprint.

Q. Must I always use emplace_back?
A. It helps most for non-trivial types. For int and similar, push_back is fine.

Q. vector vs array?
A. std::array<T,N> has fixed compile-time size. std::vector grows at runtime.

Q. Where to read next in the series?
A. Follow Previous post links or open the C++ series index.

Q. Deeper references?
A. cppreference and vendor docs.

One-liner: master size vs capacity, use reserve to cut reallocations, and keep iterators valid when you erase or reallocate.

Next: Vector & string performance — reserve & emplace_back

References


  • C++ vector & string performance — the “10 seconds for a million inserts” problem
  • C++ STL algorithms — sort, find, transform with lambdas
  • Range-based for and structured bindings

Keywords

C++, std::vector, vector basics, STL, dynamic array, reserve, capacity, push_back, emplace_back, iterators


More in the series

  • C++ map & set guide
  • C++ container selection
  • C++ lambda expressions
  • C++ std::function
  • C++ vector performance
