WebRTC: Realtime Communication | Signaling, ICE, STUN/TURN, DTLS & SRTP

Key takeaway

WebRTC: UDP media with ICE/STUN/TURN path discovery, DTLS for control and keys, SRTP for encrypted media—the browser-native realtime stack.

Introduction

WebRTC (Web Real-Time Communication) is a W3C / IETF bundle for plugin-free voice, video, and data channels in browsers and native apps—P2P or near-P2P. It combines NAT traversal, dynamic codec negotiation, and encrypted media—used for video calls, low-latency streaming, game voice, and P2P file transfer.

Behind a few API calls sit signaling servers, ICE candidates, TURN bills, and firewall policies. This article is an architecture map between specs and production debugging.

After reading this post

  • Explain SDP signaling, ICE, and offer/answer order
  • Compare STUN vs TURN and host / srflx / relay candidates
  • Sketch DTLS vs SRTP roles
  • Start signaling and clients in JavaScript, Python, and C++

Table of contents

  1. Protocol overview
  2. How it works
  3. Hands-on programming
  4. Performance characteristics
  5. Real-world use cases
  6. Optimization tips
  7. Common problems
  8. Wrap-up

Protocol overview

History and background

WebRTC grew from Google open-source work (~2011) through IETF RTCWEB and W3C. It combines VP8/VP9/H.264, Opus, RTP/RTCP, ICE, STUN, TURN, DTLS-SRTP, … into a browser-interop realtime stack. By 2026 all major browsers support core APIs; iOS Safari is part of the same operational picture.

OSI placement

Not a single layer: transport (UDP/TCP), session/presentation (media/crypto negotiation), and application (signaling) combine. Think “UDP + RTP + security + ICE” for ops and debugging.

Core properties

| Property | Description |
| --- | --- |
| P2P first | Prefer direct UDP paths for media. |
| NAT traversal | ICE tries multiple candidate paths. |
| Security | DTLS-SRTP; browser policy enforces encryption. |
| Low latency | Avoids TCP retransmits on the media path by default. |
| Signaling is app-defined | How SDP is exchanged is not fixed by WebRTC: WebSocket, HTTP, Firebase, … |

How it works

Signaling

Signaling exchanges SDP for codecs, hints, and DTLS fingerprints—the control plane. The standard does not mandate HTTPS vs WebSocket, but HTTPS / Secure WebSocket are typical. Media does not flow over signaling.
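
The offer/answer order can be sketched as a small state machine. This is a simplified model of the signaling states, assuming only the basic exchange (provisional answers, rollback, and re-offers are left out); the `SignalingState` class and the action names `set_local` / `set_remote` are illustrative, not a library API.

```python
# Allowed transitions for a basic offer/answer exchange.
# Key: (current state, action, description type) -> next state.
TRANSITIONS = {
    ("stable", "set_local", "offer"): "have-local-offer",
    ("stable", "set_remote", "offer"): "have-remote-offer",
    ("have-local-offer", "set_remote", "answer"): "stable",
    ("have-remote-offer", "set_local", "answer"): "stable",
}


class SignalingState:
    """Tracks one peer's signaling state and rejects out-of-order steps."""

    def __init__(self):
        self.state = "stable"

    def apply(self, action, desc_type):
        key = (self.state, action, desc_type)
        if key not in TRANSITIONS:
            raise ValueError(
                f"{action}({desc_type}) is not allowed in state {self.state}"
            )
        self.state = TRANSITIONS[key]
        return self.state


# Caller: create offer, set it locally, later apply the remote answer.
caller = SignalingState()
caller.apply("set_local", "offer")
caller.apply("set_remote", "answer")

# Callee: apply the remote offer first, then set its own answer locally.
callee = SignalingState()
callee.apply("set_remote", "offer")
callee.apply("set_local", "answer")
```

Applying an answer before any offer raises `ValueError`, which mirrors the kind of ordering error a peer connection reports when descriptions arrive out of sequence.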

ICE (Interactive Connectivity Establishment)

ICE gathers host, server-reflexive (STUN), and relay (TURN) candidates, runs connectivity checks, and picks a working path.
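
How ICE ranks those candidates can be illustrated with the priority formula from RFC 8445. The sketch below uses the type preferences the RFC recommends (host 126, peer-reflexive 110, server-reflexive 100, relay 0); the default local preference of 65535 is an assumption for a single-interface host.

```python
# Recommended type preferences from RFC 8445, section 5.1.2.
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


def candidate_priority(cand_type, local_pref=65535, component_id=1):
    """priority = 2^24 * type_pref + 2^8 * local_pref + (256 - component_id)."""
    return (
        (TYPE_PREFERENCE[cand_type] << 24)
        + (local_pref << 8)
        + (256 - component_id)
    )


# Higher priority is checked first: host beats srflx, which beats relay.
ranked = sorted(["relay", "host", "srflx"], key=candidate_priority, reverse=True)
```

This ordering is why a relay path is normally chosen only when direct candidates fail their connectivity checks.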

STUN / TURN

  • STUN: “What address:port does the public internet see for me?” Enables direct paths at near-zero marginal cost when they work.
  • TURN: When symmetric NATs or firewalls block P2P, media relays through a server, at a bandwidth and CPU cost; minimize TURN usage in ops.
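
To make the STUN question concrete, here is a minimal sketch of a Binding Request and the parsing of the XOR-MAPPED-ADDRESS attribute in the reply, following the RFC 5389 wire format. It assumes IPv4 only and skips authentication, retransmission, and IPv6; the function names are illustrative.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """Return (packet, transaction_id) for a Binding Request with no attributes."""
    transaction_id = transaction_id or os.urandom(12)
    header = struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE)
    return header + transaction_id, transaction_id


def parse_xor_mapped_address(message):
    """Return (ip, port) from an IPv4 XOR-MAPPED-ADDRESS attribute, or None."""
    if len(message) < 20:
        raise ValueError("a STUN header is 20 bytes")
    _msg_type, length, cookie = struct.unpack("!HHI", message[:8])
    if cookie != MAGIC_COOKIE:
        raise ValueError("magic cookie missing")
    pos, end = 20, min(len(message), 20 + length)
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", message[pos:pos + 4])
        value = message[pos + 4:pos + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            # Port and address are XORed with the magic cookie on the wire.
            port = struct.unpack("!H", value[2:4])[0] ^ (MAGIC_COOKIE >> 16)
            addr = struct.unpack("!I", value[4:8])[0] ^ MAGIC_COOKIE
            return socket.inet_ntoa(struct.pack("!I", addr)), port
        pos += 4 + attr_len + (-attr_len % 4)  # attributes are padded to 4 bytes
    return None


def query_stun(host, port=3478, timeout=2.0):
    """Send one Binding Request over UDP and return the reflexive (ip, port)."""
    packet, txid = build_binding_request()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(packet, (host, port))
        reply, _ = sock.recvfrom(2048)
    if reply[8:20] != txid:
        raise ValueError("transaction ID mismatch")
    return parse_xor_mapped_address(reply)
```

With network access, `query_stun("stun.l.google.com", 19302)` should return the address and port a server-reflexive candidate would carry. TURN builds on the same message format but adds authenticated Allocate requests, which this sketch does not cover.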

DTLS

DTLS runs a TLS-like handshake over UDP—used for SCTP/data channels and keying material.

SRTP

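SRTP encrypts RTP payloads and authenticates the whole packet (header and payload). In DTLS-SRTP the SRTP keys are derived from the DTLS handshake, and the certificate fingerprint carried in the SDP lets each peer verify that the handshake was completed with the party that sent the SDP.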
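
The link between SDP and DTLS can be sketched as a fingerprint comparison: hash the certificate seen in the DTLS handshake and compare it with the `a=fingerprint` line from signaling. This is an illustration using only hashlib; real stacks do this internally, and the helper names here are hypothetical.

```python
import hashlib


def extract_fingerprints(sdp):
    """Return [(hash_name, colon_separated_hex)] from a=fingerprint lines."""
    found = []
    for line in sdp.splitlines():
        if line.startswith("a=fingerprint:"):
            algo, _, digest = line[len("a=fingerprint:"):].strip().partition(" ")
            found.append((algo.lower(), digest.strip().upper()))
    return found


def format_fingerprint(cert_der, algo="sha-256"):
    """Hash a DER-encoded certificate and format it as in SDP (AA:BB:...)."""
    hexdigest = hashlib.new(algo.replace("-", ""), cert_der).hexdigest().upper()
    return ":".join(hexdigest[i:i + 2] for i in range(0, len(hexdigest), 2))


def certificate_matches(sdp, cert_der):
    """True if any fingerprint in the SDP matches the presented certificate."""
    return any(
        format_fingerprint(cert_der, algo) == digest
        for algo, digest in extract_fingerprints(sdp)
    )
```

If the signaling channel is tampered with, the fingerprint no longer matches the DTLS certificate and the connection should be rejected, which is why signaling itself needs HTTPS or Secure WebSocket.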

sequenceDiagram
  participant A as Peer A
  participant Sig as Signaling (HTTPS/WS)
  participant B as Peer B
  participant S as STUN/TURN
  A->>Sig: SDP offer
  Sig->>B: forward offer
  B->>Sig: SDP answer
  Sig->>A: forward answer
  A->>S: ICE binding / allocate
  B->>S: ICE binding / allocate
  A-->>B: ICE connectivity checks (UDP)
  Note over A,B: DTLS handshake, SRTP media

Hands-on programming

JavaScript (browser, manual SDP paste demo)

Minimal learning demo—production uses a signaling server.

<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>WebRTC minimal</title></head>
<body>
  <h3>Peer A</h3>
  <button id="aGo">A: create offer</button>
  <pre id="aOffer"></pre>
  <textarea id="aPasteAnswer" rows="6" cols="80" placeholder="Paste answer SDP here"></textarea>
  <button id="aApply">A: set answer</button>

  <h3>Peer B</h3>
  <textarea id="bPasteOffer" rows="6" cols="80" placeholder="Paste offer SDP here"></textarea>
  <button id="bApplyOffer">B: set offer &amp; create answer</button>
  <pre id="bAnswer"></pre>
  <script src="webrtc-demo.js"></script>
</body>
</html>
// webrtc-demo.js — save next to the HTML above
const cfg = { iceServers: [{ urls: "stun:stun.l.google.com:19302" }] };

const pcA = new RTCPeerConnection(cfg);
const pcB = new RTCPeerConnection(cfg);

pcA.onicecandidate = (e) => {
  if (!e.candidate) console.log("A ICE gathering done");
};
pcB.onicecandidate = (e) => {
  if (!e.candidate) console.log("B ICE gathering done");
};

pcA.ontrack = (e) => console.log("A got track", e.streams[0]);
pcB.ontrack = (e) => console.log("B got track", e.streams[0]);

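// Manual paste has no trickle channel, so wait until ICE gathering finishes
// and show the final SDP, which then contains every candidate.
function gatheringDone(pc) {
  if (pc.iceGatheringState === "complete") return Promise.resolve();
  return new Promise((resolve) => {
    pc.addEventListener("icegatheringstatechange", () => {
      if (pc.iceGatheringState === "complete") resolve();
    });
  });
}

// Textareas normalize line breaks; SDP lines end with CRLF.
const readSdp = (id) =>
  document.getElementById(id).value.trim().split(/\r?\n/).join("\r\n") + "\r\n";

document.getElementById("aGo").onclick = async () => {
  pcA.addTransceiver("video", { direction: "recvonly" });
  await pcA.setLocalDescription(await pcA.createOffer());
  await gatheringDone(pcA);
  document.getElementById("aOffer").textContent = pcA.localDescription.sdp;
};

document.getElementById("bApplyOffer").onclick = async () => {
  await pcB.setRemoteDescription({ type: "offer", sdp: readSdp("bPasteOffer") });
  // The offered video m-line already created a transceiver on B,
  // so no extra addTransceiver call is needed before answering.
  await pcB.setLocalDescription(await pcB.createAnswer());
  await gatheringDone(pcB);
  document.getElementById("bAnswer").textContent = pcB.localDescription.sdp;
};

document.getElementById("aApply").onclick = async () => {
  await pcA.setRemoteDescription({ type: "answer", sdp: readSdp("aPasteAnswer") });
};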

Use getUserMedia for real camera tracks. Trickle ICE often signals candidates as JSON over the wire.
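
A trickle ICE message might look like the sketch below. The inner object uses the `candidate`, `sdpMid`, and `sdpMLineIndex` fields that browsers expect when adding a candidate; the outer `"type": "candidate"` envelope is an assumption, since the signaling format is app-defined. The parser shows what the candidate string itself contains.

```python
import json


def parse_candidate(line):
    """Split an ICE candidate attribute value into named fields."""
    parts = line.split()
    if len(parts) < 8 or not parts[0].startswith("candidate:") or parts[6] != "typ":
        raise ValueError("not an ICE candidate line")
    info = {
        "foundation": parts[0][len("candidate:"):],
        "component": int(parts[1]),
        "protocol": parts[2].lower(),
        "priority": int(parts[3]),
        "address": parts[4],
        "port": int(parts[5]),
        "type": parts[7],
    }
    # Remaining tokens come in key/value pairs such as "raddr <ip> rport <port>".
    info["extensions"] = dict(zip(parts[8::2], parts[9::2]))
    return info


def trickle_message(candidate_line, sdp_mid, sdp_mline_index):
    """Wrap one candidate in a JSON signaling message (envelope is hypothetical)."""
    return json.dumps({
        "type": "candidate",
        "candidate": {
            "candidate": candidate_line,
            "sdpMid": sdp_mid,
            "sdpMLineIndex": sdp_mline_index,
        },
    })
```

The receiving side passes the inner object to its peer connection as each message arrives, so connectivity checks can start before gathering has finished.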

Python (aiortc sketch)

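#!/usr/bin/env python3
# pip install aiortc aiohttp
import asyncio
from aiortc import RTCPeerConnection, RTCSessionDescription
from aiohttp import web

# Keep references so peer connections are not garbage-collected mid-call.
pcs = set()

async def offer(request):
    # Expects JSON like {"sdp": "...", "type": "offer"} from the client.
    params = await request.json()
    offer_desc = RTCSessionDescription(sdp=params["sdp"], type=params["type"])
    pc = RTCPeerConnection()
    pcs.add(pc)

    @pc.on("connectionstatechange")
    async def on_connection_state_change():
        # Release sockets once the connection can no longer recover.
        if pc.connectionState == "failed":
            await pc.close()
            pcs.discard(pc)

    await pc.setRemoteDescription(offer_desc)
    answer = await pc.createAnswer()
    await pc.setLocalDescription(answer)
    return web.json_response(
        {"sdp": pc.localDescription.sdp, "type": pc.localDescription.type}
    )

async def on_shutdown(app):
    # Close every remaining peer connection when the server stops.
    await asyncio.gather(*(pc.close() for pc in list(pcs)))
    pcs.clear()

app = web.Application()
app.on_shutdown.append(on_shutdown)
app.router.add_post("/offer", offer)

if __name__ == "__main__":
    web.run_app(app, port=8080)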

Production SFUs often use Janus, mediasoup, LiveKit, …

C++ (split signaling)

Native media often uses libwebrtc (large) or libdatachannel / GStreamer webrtcbin. A common split: C++ signaling only (e.g. WebSocket++/Beast), browser/mobile for media.

Errors and timeouts

  • Watch iceConnectionState / connectionState; on failed, retry (for example with an ICE restart) or fall back to TURN.
  • Signaling: handle HTTP 5xx and WS drops with backoff.
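
Backoff for signaling reconnects can be sketched as exponential delays with full jitter. The base, cap, and attempt count below are assumed values to tune per deployment; `rng` is injectable so the schedule can be tested deterministically.

```python
import random


def backoff_delays(base=0.5, cap=30.0, attempts=6, rng=random.random):
    """Yield delays drawn uniformly from [0, min(cap, base * 2**n)]."""
    for n in range(attempts):
        yield rng() * min(cap, base * (2 ** n))


# Upper bounds of the schedule (rng fixed at 1.0): 0.5, 1, 2, 4, 8, 16 seconds.
upper_bounds = list(backoff_delays(rng=lambda: 1.0))
```

A reconnect loop would sleep for each yielded delay before the next attempt and give up, or surface an error to the user, once the generator is exhausted. The jitter keeps many clients from reconnecting at the same instant after a server restart.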

Performance characteristics

Latency

Direct P2P can beat one-hop SFU when paths are good. TURN far from users increases RTT.

Throughput

Congestion control (e.g. GCC) adapts bitrate from RTCP feedback. Simulcast sends multiple resolutions for SFU subscribers.
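
As a rough illustration of bitrate adaptation, the sketch below follows the loss-based rule described in the GCC Internet-Draft: cut the rate when reported loss exceeds 10%, raise it by 5% when loss is under 2%, and hold otherwise. It is a simplification; the delay-based estimator that real GCC also runs is omitted, and the min/max clamps are assumed values.

```python
def loss_based_update(bitrate_bps, loss_fraction, min_bps=50_000, max_bps=2_500_000):
    """Return the next target bitrate from the last RTCP loss report."""
    if loss_fraction > 0.10:
        # Heavy loss: back off in proportion to the loss.
        bitrate_bps = bitrate_bps * (1 - 0.5 * loss_fraction)
    elif loss_fraction < 0.02:
        # Negligible loss: probe upward gently.
        bitrate_bps = bitrate_bps * 1.05
    # Between 2% and 10% the rate is held.
    return round(max(min_bps, min(max_bps, bitrate_bps)))


# Feed a sequence of loss reports into the controller.
rate = 1_000_000
history = []
for loss in (0.0, 0.0, 0.20, 0.05):
    rate = loss_based_update(rate, loss)
    history.append(rate)
```

The asymmetric steps (slow increase, faster decrease) are what keep the sender from oscillating when it shares a bottleneck.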

Overhead

UDP + RTP + SRTP + RTCP + ICE keepalives. TURN adds server CPU and bandwidth.
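
A back-of-the-envelope calculation shows why per-packet overhead matters most for audio. The header sizes below assume IPv4 without options, a 12-byte RTP header with no extensions, and a 10-byte SRTP authentication tag (the HMAC-SHA1-80 profile); RTCP, ICE keepalives, and link-layer framing are ignored.

```python
IPV4_HEADER = 20
UDP_HEADER = 8
RTP_HEADER = 12     # fixed header, no CSRCs or header extensions
SRTP_AUTH_TAG = 10  # assumes the HMAC-SHA1-80 protection profile
PER_PACKET_OVERHEAD = IPV4_HEADER + UDP_HEADER + RTP_HEADER + SRTP_AUTH_TAG


def overhead_kbps(packets_per_second, per_packet_bytes=PER_PACKET_OVERHEAD):
    """Bitrate spent on headers alone, in kilobits per second."""
    return packets_per_second * per_packet_bytes * 8 / 1000


def overhead_share(payload_bytes, per_packet_bytes=PER_PACKET_OVERHEAD):
    """Fraction of each packet that is header rather than media."""
    return per_packet_bytes / (per_packet_bytes + payload_bytes)


# Audio in 20 ms frames means 50 packets per second.
audio_overhead = overhead_kbps(50)
# A 32 kbit/s stream at 50 pps carries 80 payload bytes per packet.
audio_share = overhead_share(80)
```

Under these assumptions the headers cost 20 kbit/s for a 50 pps audio stream, a large share of a low-bitrate voice call, while for video with packets near the MTU the same 50 bytes are a few percent.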

Measurement

Use chrome://webrtc-internals and getStats() logs for RTT, jitter, and packetsLost.
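
Counters such as packetsLost are cumulative, so dashboards usually work with the difference between two snapshots. The sketch below assumes each snapshot is a dict with the `packetsLost` and `packetsReceived` fields of an inbound RTP stats report, for example after exporting stats as JSON.

```python
def interval_loss(prev, cur):
    """Loss fraction between two cumulative inbound-rtp stats snapshots."""
    # packetsLost can move backwards when late or duplicate packets arrive.
    lost = max(0, cur["packetsLost"] - prev["packetsLost"])
    received = max(0, cur["packetsReceived"] - prev["packetsReceived"])
    expected = lost + received
    return lost / expected if expected else 0.0


earlier = {"packetsReceived": 1000, "packetsLost": 10}
later = {"packetsReceived": 1950, "packetsLost": 60}
loss = interval_loss(earlier, later)  # 50 lost out of 1000 expected
```

Sampling every second or so and alerting on sustained interval loss tends to be more useful than watching the lifetime total.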


Real-world use cases

Video conferencing

Commercial products use custom SFU infrastructure but often expose standard WebRTC APIs in browsers.

P2P file sharing

Data channels (SCTP/DTLS) send file chunks—design concurrency, retransmit, and backpressure for large files.
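
Backpressure for large files can be sketched as: send a chunk, and pause whenever the amount queued in the channel rises above a high-water mark. `FakeChannel` below is a hypothetical stand-in for a data channel so the loop can run on its own; a real channel would expose its own buffered-amount counter and drain notification, and the chunk size and threshold are assumed values.

```python
import asyncio


class FakeChannel:
    """Hypothetical channel that records sends and tracks queued bytes."""

    def __init__(self):
        self.buffered_amount = 0
        self.max_buffered = 0
        self.sent = []

    def send(self, chunk):
        self.sent.append(chunk)
        self.buffered_amount += len(chunk)
        self.max_buffered = max(self.max_buffered, self.buffered_amount)

    async def wait_for_drain(self):
        await asyncio.sleep(0)    # yield to the event loop, as a real wait would
        self.buffered_amount = 0  # pretend the transport flushed its queue


async def send_file(channel, data, chunk_size=16 * 1024, high_water=64 * 1024):
    """Send data in chunks, pausing while the channel's queue is too full."""
    for offset in range(0, len(data), chunk_size):
        while channel.buffered_amount > high_water:
            await channel.wait_for_drain()
        channel.send(data[offset:offset + chunk_size])


payload = bytes(range(256)) * 1024  # 256 KiB of sample data
channel = FakeChannel()
asyncio.run(send_file(channel, payload))
```

Without the pause, a fast reader of a large file would queue the whole file in memory before the network had sent the first megabyte.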

Live streaming

Interactive low-latency fits WebRTC; one-to-many broadcast often pairs with HLS/DASH in hybrid designs.


Optimization tips

Minimize TURN

  • Place regional TURN close to users.
  • TCP/TLS TURN fallback when UDP is blocked increases delay and cost—align with network policy.
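
One way to act on both points is to have the signaling server pick the TURN region with the lowest measured RTT and hand the client UDP, TCP, and TLS URLs for it. This is a sketch: the hostnames are placeholders, the RTT measurements are assumed to come from the application, and ports 3478 and 443 are common defaults rather than requirements.

```python
def ice_servers_for(region_rtts_ms, turn_hosts, username, credential):
    """Return (region, iceServers list) for the lowest-RTT region."""
    region = min(region_rtts_ms, key=region_rtts_ms.get)
    host = turn_hosts[region]
    return region, [
        {"urls": [f"stun:{host}:3478"]},
        {
            "urls": [
                f"turn:{host}:3478?transport=udp",   # preferred relay path
                f"turn:{host}:3478?transport=tcp",   # when UDP is blocked
                f"turns:{host}:443?transport=tcp",   # TLS on 443 for strict firewalls
            ],
            "username": username,
            "credential": credential,
        },
    ]


# Placeholder hosts and measurements for illustration only.
hosts = {"eu": "turn-eu.example.com", "us": "turn-us.example.com"}
region, servers = ice_servers_for({"eu": 28.0, "us": 95.0}, hosts, "user", "pass")
```

The returned list has the shape of the `iceServers` field in the peer connection configuration, so it can be serialized to JSON and passed to the client as is.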

Bandwidth adaptation

Cap max bitrate and frame rate in app logic when stacks allow—stability improves.

Simulcast

Simulcast + SFU avoids sending one huge bitrate to every subscriber.
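
On the SFU side, layer selection can be sketched as picking the highest simulcast layer that fits a subscriber's estimated bandwidth. The layer names, bitrates, and the 80% headroom factor below are assumed values for illustration.

```python
# (layer id, approximate bitrate in bits per second); assumed example values.
LAYERS = [("q", 150_000), ("h", 500_000), ("f", 1_500_000)]


def pick_layer(available_bps, layers=LAYERS, headroom=0.8):
    """Return the id of the best layer that fits within the bandwidth budget."""
    budget = available_bps * headroom
    fitting = [layer for layer in layers if layer[1] <= budget]
    if not fitting:
        # Nothing fits: fall back to the smallest layer rather than sending none.
        return min(layers, key=lambda layer: layer[1])[0]
    return max(fitting, key=lambda layer: layer[1])[0]


choices = [pick_layer(bps) for bps in (2_000_000, 1_000_000, 100_000)]
```

Each subscriber gets its own choice, so one viewer on a weak connection does not drag down the quality everyone else receives.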


Common problems

NAT traversal fails

Symmetric NAT / enterprise firewalls: TURN required. If only mDNS local candidates appear, check VPN and subnet settings.
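
A quick triage step is to count candidate types in a captured SDP. The sketch below classifies `a=candidate` lines and returns a short diagnosis code; the codes and their wording are this example's own, not standard error names.

```python
from collections import Counter


def summarize_candidates(sdp):
    """Return (Counter of candidate types, list of host candidate addresses)."""
    types = Counter()
    host_addresses = []
    for line in sdp.splitlines():
        if not line.startswith("a=candidate:"):
            continue
        parts = line.split()
        if len(parts) < 8 or parts[6] != "typ":
            continue
        types[parts[7]] += 1
        if parts[7] == "host":
            host_addresses.append(parts[4])
    return types, host_addresses


def diagnose(sdp):
    types, hosts = summarize_candidates(sdp)
    if not types:
        return "no-candidates"  # gathering not run, or SDP captured too early
    if types["srflx"] == 0 and types["relay"] == 0:
        if hosts and all(addr.endswith(".local") for addr in hosts):
            return "mdns-only"  # STUN/TURN unreachable; only obfuscated hosts
        return "host-only"      # STUN/TURN unreachable or not configured
    if types["relay"] == 0:
        return "no-relay"       # peers behind symmetric NAT may still fail
    return "ok"
```

A result of `no-relay` on a network that needs TURN usually points at missing or expired TURN credentials rather than at ICE itself.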

Firewall issues

UDP blocked: use TURN over TLS on allowed ports. Corporate proxies may block WebSocket signaling; 443-only designs help.

Security and compliance

  • Short-lived TURN credentials are standard.
  • Recording needs explicit media path design (e.g. SFU recording).
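
Short-lived TURN credentials are commonly issued with the shared-secret scheme that coturn supports through its `use-auth-secret` option: the username embeds an expiry timestamp and the password is an HMAC-SHA1 of that username. The sketch below assumes that scheme; the secret, user id, and TTL are placeholders.

```python
import base64
import hashlib
import hmac
import time


def turn_credentials(shared_secret, user_id, ttl_seconds=3600, now=None):
    """Return time-limited TURN credentials derived from a shared secret."""
    current = time.time() if now is None else now
    expiry = int(current + ttl_seconds)
    username = f"{expiry}:{user_id}"
    digest = hmac.new(
        shared_secret.encode(), username.encode(), hashlib.sha1
    ).digest()
    return {
        "username": username,
        "credential": base64.b64encode(digest).decode(),
        "ttl": ttl_seconds,
    }


# Issued by the signaling server after the user has authenticated.
creds = turn_credentials("replace-with-shared-secret", "alice", ttl_seconds=600)
```

The TURN server recomputes the same HMAC and rejects usernames whose timestamp has passed, so a leaked credential stops working after the TTL without any per-user state on the server.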

Wrap-up

Summary

  • WebRTC = signaling (often TCP) + ICE + UDP media + DTLS/SRTP.
  • STUN enables direct paths; TURN is the last-resort relay—understand cost.
  • Browser RTC stats should be part of every serious deployment.

When to choose WebRTC

  • Browser-to-browser realtime audio, video, and data—the default stack. Pair with the TCP guide and UDP guide for transport fundamentals.