C++ WebSocket Deep Dive | Handshake, Frames, Ping/Pong, Errors, Production Patterns
Key takeaways
Deep dive for operators: why idle TCP dies in NATs, fixing 400 handshakes, capping frames, Safari+strand, exponential backoff reconnect, per-session write queues, metrics, and graceful shutdown.
Introduction: “My WebSocket keeps dropping”
Scenario 1: disconnect every ~30 seconds
// ❌ NAT / firewall evicts idle TCP flows
// If the client is silent for longer than the middlebox timer,
// routers may RST the socket.
ws_.async_read(buffer_, [](beast::error_code ec, std::size_t bytes) {
    // ec == connection_reset or connection_aborted
});
Why? NATs and firewalls track sessions and reap idle TCP entries. WebSocket can be quiet for a long time while still logically “open,” so the path looks dead. Mitigation: send Ping/Pong (or app-level keepalives) every 20–30s—shorter than the smallest idle timeout on the path.
Scenario 2: HTTP 400 on handshake
// ❌ Server returns 400 Bad Request
ws_.async_handshake(host, "/chat",
    [](beast::error_code ec) {
        // ec == websocket::error::upgrade_declined (server did not answer 101)
        // server log: "Missing Sec-WebSocket-Key"
    });
Causes: missing Sec-WebSocket-Key, bad Upgrade, wrong version—anything that violates RFC 6455.
Scenario 3: huge messages exhaust RAM
// ❌ Accepting a 100 MiB frame blows the buffer
ws_.async_read(buffer_, [](beast::error_code ec, std::size_t bytes) {
    // bytes == 100 * 1024 * 1024
});
Mitigation: always set read_message_max. Beast's default allows messages up to 16 MiB, which is generous for most workloads; lower it to fit yours.
More scenarios
Scenario 4: Safari/Chrome WSS flakiness
If multiple threads touch the same websocket::stream without a strand, some browsers drop TLS WebSocket sessions randomly—serialize operations.
Scenario 5: reconnect storms
Immediate reconnect loops hammer the server. Use exponential backoff and jitter.
Scenario 6: broadcast write storms
Calling async_write for thousands of sessions at once floods the executor—use per-session queues and backpressure.
Goals:
- Byte-level handshake understanding
- Frames with concrete Text/Binary/Ping/Pong/Close examples
- Full heartbeat design
- Error catalog with fixes
- Best practices: reconnect, backpressure, strands
- Production: metrics, graceful shutdown
Prerequisites: Boost.Beast 1.70+, C++17.
Mental model
Treat sockets as addresses and async I/O as scheduled delivery—strands keep a single connection’s handlers ordered.
Ops-focused: these patterns come from production C++ services, not toy echo servers.
Contents
- Handshake deep dive
- Frames with worked examples
- Ping/Pong heartbeat
- Beast end-to-end examples
- Common errors
- Best practices
- Production patterns
- Checklist
1. Handshake anatomy
HTTP upgrade request (client → server)
Every WebSocket begins as an HTTP Upgrade request. Example aligned with RFC 6455:
GET /chat HTTP/1.1
Host: example.com:8080
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13
Origin: https://example.com
Header cheat sheet:
| Header | Required | Notes |
|---|---|---|
| Upgrade: websocket | ✅ | Request protocol switch |
| Connection: Upgrade | ✅ | HTTP upgrade hop |
| Sec-WebSocket-Key | ✅ | 16 random bytes → Base64 (mitigates proxy cache tricks) |
| Sec-WebSocket-Version: 13 | ✅ | Only standardized version |
| Origin | Recommended | Sent by browsers; the server should validate it, because WebSocket handshakes are not covered by CORS |
| Sec-WebSocket-Protocol | Optional | Negotiate subprotocols (chat, json, …) |
Generating Sec-WebSocket-Key
#include <random>
#include <boost/beast/core/detail/base64.hpp>
// RFC 6455: 16 random bytes → Base64
std::string generate_websocket_key() {
std::random_device rd;
std::mt19937 gen(rd());
std::uniform_int_distribution<> dis(0, 255);
unsigned char key[16];
for (int i = 0; i < 16; ++i) {
key[i] = static_cast<unsigned char>(dis(gen));
}
std::string result;
result.resize(boost::beast::detail::base64::encoded_size(16));
result.resize(boost::beast::detail::base64::encode(
&result[0], key, 16));
return result;
}
Server response (101 Switching Protocols)
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
Computing Sec-WebSocket-Accept
#include <openssl/sha.h>
#include <boost/beast/core/detail/base64.hpp>
std::string compute_accept(const std::string& key) {
const std::string magic = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";
std::string input = key + magic;
unsigned char hash[SHA_DIGEST_LENGTH];
SHA1(reinterpret_cast<const unsigned char*>(input.data()),
input.size(), hash);
std::string result;
result.resize(boost::beast::detail::base64::encoded_size(SHA_DIGEST_LENGTH));
result.resize(boost::beast::detail::base64::encode(
&result[0], hash, SHA_DIGEST_LENGTH));
return result;
}
Algorithm: SHA1(Sec-WebSocket-Key + "258EAFA5-E914-47DA-95CA-C5AB0DC85B11") → Base64
Handshake sequence
sequenceDiagram
participant C as Client
participant S as Server
C->>S: TCP connect
C->>S: HTTP GET + Upgrade + Sec-WebSocket-Key
S->>S: Validate key, compute Accept
S->>C: HTTP 101 + Sec-WebSocket-Accept
Note over C,S: WebSocket established
C->>S: WebSocket frames
S->>C: WebSocket frames
Handshake failure modes
| Status | Typical cause |
|---|---|
| 400 Bad Request | Missing Sec-WebSocket-Key, bad Upgrade |
| 403 Forbidden | Origin check failed |
| 426 Upgrade Required | Wrong Sec-WebSocket-Version |
| 503 Service Unavailable | overload / connection cap |
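The 400 and 426 rows above can be sketched as a small validation function. This is only an illustration of the checks a hand-written server might run, not Beast's implementation: `Headers`, `contains_ci`, and `upgrade_status` are hypothetical names, and the sketch assumes the HTTP parser has already lower-cased header names.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Assumption: header names were lower-cased by the HTTP parser.
using Headers = std::map<std::string, std::string>;

// Case-insensitive substring test. A stricter server would split the value
// on commas and compare whole tokens.
inline bool contains_ci(const std::string& value, const std::string& needle) {
    std::string lower(value.size(), '\0');
    std::transform(value.begin(), value.end(), lower.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return lower.find(needle) != std::string::npos;
}

// Returns the HTTP status a server would answer with: 101, 400, or 426.
inline int upgrade_status(const Headers& h) {
    auto get = [&h](const char* name) {
        auto it = h.find(name);
        return it == h.end() ? std::string() : it->second;
    };
    if (!contains_ci(get("upgrade"), "websocket")) return 400;
    if (!contains_ci(get("connection"), "upgrade")) return 400;
    // 16 bytes encode to exactly 24 Base64 characters ending in "==".
    const std::string key = get("sec-websocket-key");
    if (key.size() != 24 || key.compare(22, 2, "==") != 0) return 400;
    if (get("sec-websocket-version") != "13") return 426;
    return 101;
}
```

Origin checks (403) and connection caps (503) depend on deployment policy, so they are left out of the sketch.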
2. Frames with worked examples
Frame layout (RFC 6455)
graph LR
subgraph Header
A[FIN 1bit] --> B[RSV 3bit]
B --> C[Opcode 4bit]
C --> D[Mask 1bit]
D --> E[Payload Len 7bit]
end
E --> F[Extended 0/2/8 byte]
F --> G[Mask Key 0/4 byte]
G --> H[Payload Data]
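To make the layout concrete, here is a minimal header parser. It is a sketch for reading the diagram, not how Beast parses frames; `FrameHeader` and `parse_frame_header` are hypothetical names, and RSV bits are ignored.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct FrameHeader {
    bool fin = false;
    std::uint8_t opcode = 0;
    bool masked = false;
    std::uint64_t payload_len = 0;
    std::size_t header_size = 0;  // bytes before the payload, masking key included
};

// Returns false when `buf` does not yet contain the complete header.
inline bool parse_frame_header(const std::vector<std::uint8_t>& buf, FrameHeader& out) {
    if (buf.size() < 2) return false;
    out.fin = (buf[0] & 0x80) != 0;
    out.opcode = static_cast<std::uint8_t>(buf[0] & 0x0F);
    out.masked = (buf[1] & 0x80) != 0;
    std::uint64_t len = buf[1] & 0x7F;
    std::size_t pos = 2;
    // 126 means a 16-bit length follows, 127 means a 64-bit length follows.
    const std::size_t ext = (len == 126) ? 2 : (len == 127) ? 8 : 0;
    if (buf.size() < pos + ext) return false;
    if (ext != 0) {
        len = 0;
        for (std::size_t i = 0; i < ext; ++i) len = (len << 8) | buf[pos + i];  // big-endian
        pos += ext;
    }
    if (out.masked) pos += 4;  // 4-byte masking key
    if (buf.size() < pos) return false;
    out.payload_len = len;
    out.header_size = pos;
    return true;
}
```

A real parser would also reject non-zero RSV bits when no extension is negotiated and enforce a maximum payload length before allocating.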
Opcode reference
| Opcode | Value | Meaning | Direction |
|---|---|---|---|
| Continuation | 0x0 | Continues previous fragment | Both |
| Text | 0x1 | UTF-8 text | Both |
| Binary | 0x2 | Binary payload | Both |
| Close | 0x8 | Close connection | Both |
| Ping | 0x9 | Heartbeat probe | Both |
| Pong | 0xA | Heartbeat reply | Both |
Text frame (masked client → server)
Client → server: “Hello” (5 bytes)
Byte 0: 0x81 (FIN=1, Opcode=0x1 Text)
Byte 1: 0x85 (Mask=1, Payload Len=5)
Bytes 2-5: Masking Key (4 random bytes)
Bytes 6-10: "Hello" XOR Masking Key
// Masking required for client → server
// Example
void mask_payload(uint8_t* data, size_t len, const uint8_t key[4]) {
for (size_t i = 0; i < len; ++i) {
data[i] ^= key[i % 4];
}
}
Why mask: mitigates cache poisoning when broken intermediaries mis-classify traffic.
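As a worked example, the masked "Hello" frame in RFC 6455 section 5.7 uses the key 0x37 0xfa 0x21 0x3d. The sketch below (with a hypothetical helper name, `xor_mask`) reproduces those payload bytes and shows that applying the mask a second time restores the original text.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// XOR each byte with key[i % 4]. Masking and unmasking are the same operation.
inline std::string xor_mask(std::string data, const std::uint8_t key[4]) {
    for (std::size_t i = 0; i < data.size(); ++i) {
        data[i] = static_cast<char>(static_cast<std::uint8_t>(data[i]) ^ key[i % 4]);
    }
    return data;
}
```

With that key, "Hello" becomes 0x7f 0x9f 0x4d 0x51 0x58, so the full frame on the wire is 0x81 0x85 0x37 0xfa 0x21 0x3d followed by those five bytes.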
Ping frame
Byte 0: 0x89 (FIN=1, Opcode=0x9 Ping)
Byte 1: 0x00 (MASK=0 server→client, len=0)
If payload present, Pong echoes it.
Pong frame
Byte 0: 0x8A (FIN=1, Opcode=0xA Pong)
Byte 1: 0x00 (Payload Len=0)
Close frame
Byte 0: 0x88 (FIN=1, Opcode=0x8 Close)
Byte 1: 0x02 (Payload Len=2)
Bytes 2-3: close code (e.g. 1000 normal, 1001 going away, 1002 protocol error)
Bytes 4+: optional UTF-8 reason
Common close codes:
| Code | Meaning |
|---|---|
| 1000 | Normal Closure |
| 1001 | Going away (server shutdown, etc.) |
| 1002 | Protocol Error |
| 1003 | Unsupported Data |
| 1006 | Abnormal closure (no close frame; reported locally, never sent on the wire) |
| 1007 | Invalid payload (UTF-8) |
| 1011 | Internal Error |
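The close payload described above is a 2-byte big-endian status code followed by an optional reason. A minimal sketch of encoding and decoding it, using the hypothetical helpers `make_close_payload` and `close_code_of`:

```cpp
#include <cstdint>
#include <string>

// Close payload = 2-byte big-endian status code + optional UTF-8 reason.
inline std::string make_close_payload(std::uint16_t code, const std::string& reason) {
    std::string out;
    out.push_back(static_cast<char>(code >> 8));
    out.push_back(static_cast<char>(code & 0xFF));
    // Control frames carry at most 125 payload bytes, so the reason gets 123.
    // Note: this byte-level cut could split a multi-byte UTF-8 character.
    out += reason.substr(0, 123);
    return out;
}

// Returns 1005 ("no status received") when the payload carries no code.
inline std::uint16_t close_code_of(const std::string& payload) {
    if (payload.size() < 2) return 1005;
    return static_cast<std::uint16_t>(
        (static_cast<std::uint8_t>(payload[0]) << 8) |
        static_cast<std::uint8_t>(payload[1]));
}
```

Code 1000 encodes as 0x03 0xE8. Beast builds and parses these payloads itself when you call async_close, so the sketch only matters for hand-written stacks or packet inspection.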
Handling control frames in Beast
ws_.control_callback(
    [](websocket::frame_type kind, beast::string_view payload) {
        switch (kind) {
        case websocket::frame_type::ping:
            // Beast sends Pong automatically
            break;
        case websocket::frame_type::pong:
            // observe heartbeat reply
            break;
        case websocket::frame_type::close:
            // peer initiated close
            break;
        }
    });
3. Ping/Pong heartbeat
Ping/Pong sequence
sequenceDiagram
participant C as Client
participant S as Server
loop Every 30s
C->>S: Ping
S->>C: Pong (auto)
end
Note over C: No Pong within 10s
C->>C: Treat as dead → reconnect
Client: send Ping + Pong timeout
class WebSocketClientWithHeartbeat
: public std::enable_shared_from_this<WebSocketClientWithHeartbeat> {
websocket::stream<beast::tcp_stream> ws_;
beast::flat_buffer buffer_;
net::steady_timer ping_timer_;
net::steady_timer pong_timeout_;
bool pong_received_ = false;
public:
explicit WebSocketClientWithHeartbeat(net::io_context& ioc)
: ws_(net::make_strand(ioc)),
ping_timer_(ws_.get_executor()),
pong_timeout_(ws_.get_executor()) {}
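void start_heartbeat() {
// Route Pong frames to on_pong(). Capturing `this` is safe here: the
// callback is stored inside ws_, which this object owns, and it only
// runs while an async_read is pending.
ws_.control_callback(
[this](websocket::frame_type kind, beast::string_view) {
if (kind == websocket::frame_type::pong) on_pong();
});
pong_received_ = true;
schedule_ping();
}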
private:
void schedule_ping() {
ping_timer_.expires_after(std::chrono::seconds(30));
ping_timer_.async_wait(
[self = shared_from_this()](beast::error_code ec) {
if (ec) return;
self->send_ping();
});
}
void send_ping() {
pong_received_ = false;
pong_timeout_.expires_after(std::chrono::seconds(10));
pong_timeout_.async_wait(
[self = shared_from_this()](beast::error_code ec) {
if (ec) return;
if (!self->pong_received_) {
std::cerr << "Pong timeout - reconnecting\n";
self->reconnect();
return;
}
});
ws_.async_ping({},
[self = shared_from_this()](beast::error_code ec) {
if (ec) {
std::cerr << "Ping failed: " << ec.message() << "\n";
return;
}
self->schedule_ping();
});
}
void on_pong() {
pong_received_ = true;
pong_timeout_.cancel();
}
void reconnect() {
// reconnect with exponential backoff
}
};
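The liveness rule behind this class (a Ping that stays unanswered for longer than the timeout means the peer is dead) can be isolated from the I/O and tested on its own. The sketch below is one way to do that; `HeartbeatState` is a hypothetical helper, not part of Beast, and time is passed in explicitly so tests do not have to sleep.

```cpp
#include <chrono>

class HeartbeatState {
public:
    using Clock = std::chrono::steady_clock;

    explicit HeartbeatState(std::chrono::seconds pong_timeout)
        : pong_timeout_(pong_timeout) {}

    void on_ping_sent(Clock::time_point now) {
        awaiting_pong_ = true;
        ping_sent_at_ = now;
    }

    void on_pong() { awaiting_pong_ = false; }

    // True when a Ping has gone unanswered for longer than the timeout.
    bool is_dead(Clock::time_point now) const {
        return awaiting_pong_ && (now - ping_sent_at_) > pong_timeout_;
    }

private:
    std::chrono::seconds pong_timeout_;
    bool awaiting_pong_ = false;
    Clock::time_point ping_sent_at_{};
};
```

A timer handler would call `is_dead(Clock::now())` and trigger the reconnect path when it returns true.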
Server: auto Pong to Ping
Beast answers Ping with Pong by default. Custom logging example:
ws_.control_callback(
    // Capture `this`, not shared_from_this(): the callback is stored inside
    // ws_, so an owning capture would form a cycle and the session would
    // never be destroyed.
    [this](websocket::frame_type kind, beast::string_view payload) {
        if (kind == websocket::frame_type::ping) {
            // Beast sends Pong automatically
            // manual: ws_.async_pong(payload);
        } else if (kind == websocket::frame_type::pong) {
            // client Pong after server-initiated Ping
            on_pong_received();
        }
    });
Server → client Ping (optional)
Servers may initiate Ping to verify the peer is still alive.
void server_send_ping() {
ws_.async_ping("heartbeat",
[self = shared_from_this()](beast::error_code ec) {
if (ec) {
// write failure ⇒ dead connection
self->close_session();
}
});
}
4. Beast examples
Async client (handshake + read + ping)
#include <boost/beast.hpp>
#include <boost/asio.hpp>
#include <iostream>
namespace beast = boost::beast;
namespace websocket = beast::websocket;
namespace net = boost::asio;
using tcp = net::ip::tcp;
class CompleteWebSocketClient
: public std::enable_shared_from_this<CompleteWebSocketClient> {
websocket::stream<beast::tcp_stream> ws_;
tcp::resolver resolver_;  // member: a local resolver would be destroyed before async_resolve completes
beast::flat_buffer buffer_;
net::steady_timer ping_timer_;
std::string host_;
std::string path_;
public:
explicit CompleteWebSocketClient(net::io_context& ioc)
: ws_(net::make_strand(ioc)),
resolver_(ws_.get_executor()),
ping_timer_(ws_.get_executor()) {}
void connect(const std::string& host, const std::string& port,
const std::string& path = "/") {
host_ = host;
path_ = path;
resolver_.async_resolve(host, port,
beast::bind_front_handler(&CompleteWebSocketClient::on_resolve,
shared_from_this()));
}
private:
void on_resolve(beast::error_code ec,
tcp::resolver::results_type results) {
if (ec) {
std::cerr << "Resolve: " << ec.message() << "\n";
return;
}
beast::get_lowest_layer(ws_).async_connect(results,
beast::bind_front_handler(&CompleteWebSocketClient::on_connect,
shared_from_this()));
}
void on_connect(beast::error_code ec,
tcp::resolver::results_type::endpoint_type ep) {
if (ec) {
std::cerr << "Connect: " << ec.message() << "\n";
return;
}
ws_.async_handshake(host_, path_,
beast::bind_front_handler(&CompleteWebSocketClient::on_handshake,
shared_from_this()));
}
void on_handshake(beast::error_code ec) {
if (ec) {
std::cerr << "Handshake: " << ec.message() << "\n";
return;
}
std::cout << "WebSocket connected\n";
do_read();
start_ping();
}
void do_read() {
ws_.async_read(buffer_,
beast::bind_front_handler(&CompleteWebSocketClient::on_read,
shared_from_this()));
}
void on_read(beast::error_code ec, std::size_t bytes) {
if (ec) {
if (ec != websocket::error::closed) {
std::cerr << "Read: " << ec.message() << "\n";
}
return;
}
std::cout << "Received: "
<< beast::buffers_to_string(buffer_.data()) << "\n";
buffer_.consume(buffer_.size());
do_read();
}
void start_ping() {
ping_timer_.expires_after(std::chrono::seconds(30));
ping_timer_.async_wait(
[self = shared_from_this()](beast::error_code ec) {
if (ec) return;
self->ws_.async_ping({},
[self](beast::error_code ec) {
if (!ec) self->start_ping();
});
});
}
};
Async echo server
class CompleteWebSocketSession
: public std::enable_shared_from_this<CompleteWebSocketSession> {
websocket::stream<beast::tcp_stream> ws_;
beast::flat_buffer buffer_;
public:
explicit CompleteWebSocketSession(tcp::socket socket)
: ws_(std::move(socket)) {}
void run() {
ws_.set_option(websocket::stream_base::timeout::suggested(
beast::role_type::server));
ws_.read_message_max(64 * 1024); // 64 KiB cap
ws_.async_accept(
beast::bind_front_handler(&CompleteWebSocketSession::on_accept,
shared_from_this()));
}
private:
void on_accept(beast::error_code ec) {
if (ec) {
std::cerr << "Accept: " << ec.message() << "\n";
return;
}
do_read();
}
void do_read() {
ws_.async_read(buffer_,
beast::bind_front_handler(&CompleteWebSocketSession::on_read,
shared_from_this()));
}
void on_read(beast::error_code ec, std::size_t) {
if (ec) {
if (ec == websocket::error::closed) {
std::cout << "Connection closed normally\n";
} else {
std::cerr << "Read: " << ec.message() << "\n";
}
return;
}
ws_.text(ws_.got_text());
ws_.async_write(buffer_.data(),
beast::bind_front_handler(&CompleteWebSocketSession::on_write,
shared_from_this()));
}
void on_write(beast::error_code ec, std::size_t) {
if (ec) {
std::cerr << "Write: " << ec.message() << "\n";
return;
}
buffer_.consume(buffer_.size());
do_read();
}
};
5. Common errors
Error 1: Handshake 400 Bad Request
Symptom: the client handshake fails with websocket::error::upgrade_declined (the server answered 400 instead of 101)
Cause:
- Missing or malformed Sec-WebSocket-Key
- Wrong Upgrade token / casing
- Missing Connection: Upgrade
Fix:
// Beast generates correct headers
// manual implementations must follow RFC 6455
ws_.async_handshake(host, path,
    [](beast::error_code ec) {
        if (ec == websocket::error::upgrade_declined) {
            std::cerr << "Check Upgrade, Connection, Sec-WebSocket-Key\n";
        }
    });
Error 2: bad_version (426 Upgrade Required)
Symptom: server returns 426
Cause: Sec-WebSocket-Version ≠ 13
Fix: Beast defaults to 13; manual stacks must send 13.
Error 3: connection_reset / connection_aborted
Symptom: read/write fails mid-flight
Cause:
- NAT/firewall idle timeout
- server restart
- flaky network
Fix: heartbeats + reconnect policy
ws_.async_read(buffer_,
[self = shared_from_this()](beast::error_code ec, std::size_t) {
if (ec) {
if (ec == net::error::connection_reset ||
ec == net::error::connection_aborted) {
self->schedule_reconnect();
}
return;
}
// ...
});
Error 4: frame too big / payload too large
Symptom: websocket::error::message_too_big
Cause: payload larger than read_message_max
Fix:
// server: cap incoming messages
ws_.read_message_max(1024 * 1024); // 1MB
// set the same limit on clients
ws_.read_message_max(1024 * 1024);
Error 5: Safari/Chrome WSS flakiness (multithreaded)
Symptom: random drops on Safari/macOS and some Chrome builds
Cause: concurrent access to the same stream
Fix: serialize with a strand
// stream bound to a strand
auto strand = net::make_strand(ioc);
websocket::stream<beast::tcp_stream> ws_(strand);
// every async op runs on that strand
ws_.async_read(buffer_, net::bind_executor(strand,
    [](beast::error_code ec, std::size_t bytes) { /* ... */ }));
Error 6: Mask required (client → server)
Symptom: server rejects client frames
Cause: RFC 6455 requires masking for client→server frames
Fix: Beast masks automatically; manual stacks must set MASK.
Error 7: Invalid UTF-8 (text frames)
Symptom: websocket::error::bad_frame_payload (the peer closes with code 1007)
Cause: invalid UTF-8 in a text frame
Fix:
// send as binary or validate UTF-8 first
ws_.binary(true);
ws_.async_write(net::buffer(data), ...);
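For the "validate UTF-8 first" option, a validator has to reject more than stray bytes: overlong encodings, UTF-16 surrogates, and truncated sequences are all invalid too. The following is a minimal sketch of such a check, with `is_valid_utf8` as a hypothetical helper name. It favours readability over speed.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

inline bool is_valid_utf8(const std::string& s) {
    std::size_t i = 0;
    const std::size_t n = s.size();
    while (i < n) {
        const std::uint8_t b0 = static_cast<std::uint8_t>(s[i]);
        std::size_t extra = 0;     // continuation bytes expected
        std::uint32_t cp = 0;      // decoded code point
        std::uint32_t min_cp = 0;  // smallest code point valid for this length
        if (b0 < 0x80)                { extra = 0; cp = b0;        min_cp = 0; }
        else if ((b0 & 0xE0) == 0xC0) { extra = 1; cp = b0 & 0x1F; min_cp = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { extra = 2; cp = b0 & 0x0F; min_cp = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { extra = 3; cp = b0 & 0x07; min_cp = 0x10000; }
        else return false;  // stray continuation byte or invalid lead byte
        if (n - i - 1 < extra) return false;  // truncated sequence
        for (std::size_t k = 1; k <= extra; ++k) {
            const std::uint8_t b = static_cast<std::uint8_t>(s[i + k]);
            if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }
        if (cp < min_cp) return false;                   // overlong encoding
        if (cp > 0x10FFFF) return false;                 // beyond the Unicode range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // UTF-16 surrogate
        i += extra + 1;
    }
    return true;
}
```

A sender could call this before `ws_.text(true)` and fall back to a binary frame when it returns false.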
Error 8: Double read (overlapping async_read)
Symptom: UB / crashes
Cause: second async_read before the first completes
Fix: chain reads only inside the completion handler
void on_read(beast::error_code ec, std::size_t) {
if (ec) return;
// handle ...
do_read(); // schedule next read here only
}
6. Best practices
1. Cap message size
// rough guidance
// chat JSON: 64 KiB
// JSON API: 1MB
// binary streaming: up to 10 MiB (watch memory)
ws_.read_message_max(64 * 1024);
2. Reconnect with exponential backoff
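// Member, not a function-local static: int reconnect_attempt_ = 0;
void schedule_reconnect() {
    // 1 s, 2 s, 4 s, ... capped at 60 s (the exponent is clamped so the
    // shift cannot overflow).
    auto delay = std::min(
        std::chrono::seconds(1LL << std::min(reconnect_attempt_, 6)),
        std::chrono::seconds(60));
    ++reconnect_attempt_;
    reconnect_timer_.expires_after(delay);
    reconnect_timer_.async_wait(
        [self = shared_from_this()](beast::error_code ec) {
            if (!ec) self->connect(self->host_, self->port_, self->path_);
        });
}
// Reset reconnect_attempt_ to 0 in on_handshake() once the handshake
// succeeds, not when the reconnect attempt merely starts.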
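Backoff alone still lets many clients retry in lockstep after an outage, which is why jitter was recommended earlier. Below is a minimal sketch of the delay calculation using the "full jitter" variant (pick uniformly between zero and the exponential ceiling). The function names and the clamp of 20 doublings are assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <chrono>
#include <random>

// Ceiling for attempt n: base * 2^n, capped at `cap`. The exponent is clamped
// so the shift cannot overflow for realistic base values.
inline std::chrono::milliseconds backoff_ceiling(int attempt,
                                                 std::chrono::milliseconds base,
                                                 std::chrono::milliseconds cap) {
    using rep = std::chrono::milliseconds::rep;
    const int shift = std::min(std::max(attempt, 0), 20);
    const rep raw = base.count() * (rep{1} << shift);
    return std::chrono::milliseconds(std::min<rep>(raw, cap.count()));
}

// Full jitter: a uniformly random delay in [0, ceiling].
template <class Rng>
std::chrono::milliseconds backoff_with_jitter(int attempt,
                                              std::chrono::milliseconds base,
                                              std::chrono::milliseconds cap,
                                              Rng& rng) {
    const auto ceiling = backoff_ceiling(attempt, base, cap);
    std::uniform_int_distribution<std::chrono::milliseconds::rep> dist(0, ceiling.count());
    return std::chrono::milliseconds(dist(rng));
}
```

The result can be passed straight to `reconnect_timer_.expires_after(...)`. Seeding the generator per process (for example from `std::random_device`) keeps clients from picking identical delays.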
3. Graceful Close
void close() {
ws_.async_close(websocket::close_code::normal,
[self = shared_from_this()](beast::error_code ec) {
if (ec) {
beast::get_lowest_layer(self->ws_).close();
}
});
}
4. Broadcast backpressure
// ❌ bad: thousands of concurrent writes
for (auto& session : sessions_) {
session->ws_.async_write(...); // floods executor
}
// ✅ good: per-session queues
void broadcast(const std::string& msg) {
for (auto& session : sessions_) {
session->enqueue(msg);
}
}
void enqueue(const std::string& msg) {
bool was_empty = write_queue_.empty();
write_queue_.push(msg);
if (was_empty) do_write();
}
void do_write() {
if (write_queue_.empty()) return;
ws_.async_write(net::buffer(write_queue_.front()),
[this](beast::error_code ec, std::size_t) {
if (!ec) {
write_queue_.pop();
do_write();
}
});
}
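The queue above serializes writes, but it grows without bound if a client reads slowly. One way to add real backpressure is to cap the queue and let the caller decide what to do when it is full. This is a minimal sketch with a hypothetical `BoundedWriteQueue` class; the drop-or-disconnect policy is left to the session.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

class BoundedWriteQueue {
public:
    explicit BoundedWriteQueue(std::size_t max_messages) : max_(max_messages) {}

    // Returns false when the queue is full; the caller decides whether to
    // drop the message or disconnect the slow consumer.
    bool try_push(std::string msg) {
        if (q_.size() >= max_) return false;
        q_.push_back(std::move(msg));
        return true;
    }

    bool empty() const { return q_.empty(); }
    std::size_t size() const { return q_.size(); }
    const std::string& front() const { return q_.front(); }
    void pop() { q_.pop_front(); }

private:
    std::size_t max_;
    std::deque<std::string> q_;
};
```

`std::deque` keeps references to existing elements valid when new ones are pushed at the back, so the buffer handed to `async_write` from `front()` stays valid until `pop()` is called in the completion handler.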
5. Configure timeouts
websocket::stream_base::timeout opt{
std::chrono::seconds(30), // handshake timeout
std::chrono::seconds(30), // idle timeout
false // keepalive pings
};
ws_.set_option(opt);
6. Logging
ws_.async_handshake(host, path,
[host, path](beast::error_code ec) {
if (ec) {
spdlog::error("WebSocket handshake failed: {} {} {}",
host, path, ec.message());
}
});
7. Production patterns
Pattern 1: connection limits
class WebSocketServer {
std::atomic<int> connection_count_{0};
static constexpr int max_connections_ = 10000;
void do_accept() {
acceptor_.async_accept(
[this](beast::error_code ec, tcp::socket socket) {
if (ec) return;
if (connection_count_.load() >= max_connections_) {
socket.close();
spdlog::warn("Connection limit reached");
} else {
connection_count_++;
std::make_shared<Session>(std::move(socket),
[this]() { connection_count_--; })->run();
}
do_accept();
});
}
};
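The check-then-increment in the accept handler is fine while a single thread accepts connections. If several threads can accept concurrently, the check and the increment need to be one atomic step. Here is a minimal sketch of that, assuming a hypothetical `ConnectionLimiter` helper:

```cpp
#include <atomic>

class ConnectionLimiter {
public:
    explicit ConnectionLimiter(int max) : max_(max) {}

    // Claims a slot atomically: no other thread can slip in between the
    // limit check and the increment.
    bool try_acquire() {
        int cur = count_.load(std::memory_order_relaxed);
        while (cur < max_) {
            // On failure, compare_exchange_weak reloads `cur` and we retry.
            if (count_.compare_exchange_weak(cur, cur + 1)) return true;
        }
        return false;
    }

    void release() { count_.fetch_sub(1); }
    int active() const { return count_.load(); }

private:
    const int max_;
    std::atomic<int> count_{0};
};
```

The session would call `release()` from its destructor or close callback, mirroring the decrement lambda passed to `Session` above.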
Pattern 2: Graceful shutdown
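void shutdown() {
    acceptor_.close();  // stop accepting new connections
    for (auto& session : sessions_) {
        // In a multithreaded server, post this onto the session's strand.
        session->ws_.async_close(websocket::close_code::going_away,
            [session](beast::error_code) {});
    }
    // Release the work guard so ioc_.run() returns once the close handshakes
    // finish. Calling ioc_.stop() here would abandon them mid-flight.
    work_guard_.reset();
}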
Pattern 3: Metrics
struct WebSocketMetrics {
std::atomic<uint64_t> connections_total{0};
std::atomic<uint64_t> connections_active{0};
std::atomic<uint64_t> messages_received{0};
std::atomic<uint64_t> messages_sent{0};
std::atomic<uint64_t> errors_handshake{0};
std::atomic<uint64_t> errors_read{0};
};
// export to Prometheus/Grafana/etc.
void on_handshake(beast::error_code ec) {
if (ec) {
metrics_.errors_handshake++;
return;
}
metrics_.connections_active++;
}
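For the export step, Prometheus scrapes a plain-text format: a `# TYPE` line followed by `name value` lines. The sketch below renders two counters that way. The metric names and the reduced `WsCounters` struct are assumptions made for illustration; a real service would more likely use a Prometheus client library.

```cpp
#include <atomic>
#include <cstdint>
#include <sstream>
#include <string>

// Reduced stand-in for the WebSocketMetrics struct above.
struct WsCounters {
    std::atomic<std::uint64_t> connections_total{0};
    std::atomic<std::uint64_t> errors_handshake{0};
};

// Renders the counters in the Prometheus text exposition format.
inline std::string render_prometheus(const WsCounters& m) {
    std::ostringstream out;
    out << "# TYPE websocket_connections_total counter\n"
        << "websocket_connections_total " << m.connections_total.load() << "\n"
        << "# TYPE websocket_handshake_errors_total counter\n"
        << "websocket_handshake_errors_total " << m.errors_handshake.load() << "\n";
    return out.str();
}
```

Serving this string from an HTTP `/metrics` endpoint is enough for a scraper to pick it up. Values that go up and down, such as active connections, would be declared as `gauge` rather than `counter`.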
Pattern 4: Subprotocol negotiation
// client
ws_.set_option(websocket::stream_base::decorator(
    [](websocket::request_type& req) {
        req.set(beast::http::field::sec_websocket_protocol,
                "chat, json");
    }));
// server: pick one during accept
ws_.set_option(websocket::stream_base::decorator(
    [](websocket::response_type& res) {
        res.set(beast::http::field::sec_websocket_protocol, "chat");
    }));
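The server example hard-codes "chat". In practice the server has to choose from what the client offered. A minimal sketch of that selection, honouring the client's preference order, follows; `split_protocols` and `select_protocol` are hypothetical helpers, and reading the offered header from the upgrade request is left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Splits "chat, json" into {"chat", "json"}, trimming spaces and tabs.
inline std::vector<std::string> split_protocols(const std::string& header) {
    std::vector<std::string> out;
    std::size_t start = 0;
    while (start <= header.size()) {
        std::size_t end = header.find(',', start);
        if (end == std::string::npos) end = header.size();
        const std::size_t a = header.find_first_not_of(" \t", start);
        if (a != std::string::npos && a < end) {
            const std::size_t b = header.find_last_not_of(" \t", end - 1);
            out.push_back(header.substr(a, b - a + 1));
        }
        start = end + 1;
    }
    return out;
}

// Returns the first client-offered protocol the server supports, or "" if
// there is no overlap.
inline std::string select_protocol(const std::string& offered,
                                   const std::vector<std::string>& supported) {
    for (const auto& p : split_protocols(offered)) {
        if (std::find(supported.begin(), supported.end(), p) != supported.end()) {
            return p;
        }
    }
    return std::string();
}
```

When the result is empty, the server can either omit Sec-WebSocket-Protocol from the response or reject the handshake, depending on whether a subprotocol is mandatory for the application.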
Pattern 5: WSS (TLS)
using ssl_stream = boost::asio::ssl::stream<beast::tcp_stream>;
// Arguments are forwarded to the SSL stream: executor first, then the context.
websocket::stream<ssl_stream> wss_(net::make_strand(ioc), ssl_ctx);
// After the TCP connect: TLS handshake first, then the WebSocket upgrade
wss_.next_layer().async_handshake(net::ssl::stream_base::client,
    [this](beast::error_code ec) {
        if (!ec) {
            wss_.async_handshake(host_, path_, ...);
        }
    });
8. Checklist
Handshake
- Random 16-byte key, Base64
- Sec-WebSocket-Accept: SHA1 + magic + Base64
- Upgrade: websocket, Connection: Upgrade
- Sec-WebSocket-Version: 13
Frames
- Client→server frames masked
- Validate UTF-8 for text frames
- Configure read_message_max
- Close frames include code + optional reason
Ping/Pong
- Ping every 20–30s
- Reconnect if Pong missing ~10s
- Server answers Ping with Pong
Errors
- connection_reset → reconnect
- message_too_big → enforce caps
- Log handshake failures; retry with backoff
Production
- Use strands under concurrency
- Exponential backoff reconnects
- Enforce max concurrent sockets
- Graceful shutdown
- Metrics and structured logging
Summary
| Topic | Detail |
|---|---|
| Handshake | HTTP Upgrade + Sec-WebSocket-Key / Accept |
| Frames | FIN, opcode, mask bit, payload |
| Masking | Required client→server |
| Ping/Pong | ~30s heartbeat; reconnect if Pong missing ~10s |
| Errors | 400/426, connection_reset, message_too_big |
| Production | strands, backpressure, backoff, metrics |
FAQ
Handshake returns 400
Verify Sec-WebSocket-Key, Upgrade, and Connection. Beast sets these correctly when you use async_handshake / async_accept.
Safari drops WSS
Construct the stream on a strand (net::make_strand) and bind every async op to it.
Broadcast slows the server
Use per-session write queues and sequential writes (backpressure) instead of fanning out thousands of concurrent async_write calls.
What should I read first?
C++ WebSocket fundamentals #30-1, then this article.
Related
- C++ WebSocket fundamentals #30-1
- C++ SSL/TLS with Asio #30-2
- C++ chat server #31-1
Keywords
C++, WebSocket, Beast, handshake, frames, Ping, Pong, production, errors.