H.264 (AVC) Video Codec: Complete Guide | Profiles, FFmpeg & Streaming

Key takeaway

H.264 (AVC) remains the most widely deployed video codec. This guide covers its compatibility story, tooling, and practical FFmpeg settings in one place.

Introduction

H.264 / AVC (Advanced Video Coding) has been the most widely deployed lossy video codec since the mid-2000s. From broadcast and OTT to mobile cameras, conferencing, and game capture, hardware decoders are ubiquitous, so “encode once, play everywhere” still often means H.264.

HEVC and AV1 compress better, and patent licensing matters for commercial distribution. This article ties the compression pipeline to FFmpeg settings you can use today.

After reading this post

  • Explain H.264 block structure, profiles, and levels in plain language
  • Follow intra/inter prediction → transform & quantize → entropy coding at a high level
  • Balance quality, speed, and compatibility with libx264, NVENC/VAAPI, and similar
  • Decide where H.264 fits for YouTube, mobile, and browsers, and recognize common failure modes

Table of contents

  1. Codec overview
  2. How compression works
  3. Practical encoding
  4. Performance comparison
  5. Real-world use cases
  6. Optimization tips
  7. Common problems
  8. Wrap-up

Codec overview

History and background

H.264 was standardized by the JVT (Joint Video Team) of ITU-T VCEG and ISO/IEC MPEG, published as MPEG-4 Part 10 and ITU-T H.264. Compared with MPEG-2 and MPEG-4 Part 2 (e.g. DivX/Xvid), it aimed for roughly half the bitrate at the same quality, with rich tools for network streaming and low-latency realtime (slices, reference frames, etc.).

Technical characteristics

  • Macroblock-based: Typically 16×16 luma macroblocks (chroma depends on subsampling, often 8×8).
  • Variable block sizes: Inter prediction uses 16×16 down to 4×4 partitions to track motion.
  • Multiple references and B-frames: Temporal redundancy is removed using past and future frames (B-frame usage depends on profile).
  • Intra prediction: 4×4 and 16×16 luma modes (plus 8×8 in High profile) exploit spatial correlation.
  • Deblocking filter: Reduces block-edge artifacts.
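
To make the macroblock idea concrete, here is a minimal sketch that counts how many 16×16 macroblocks are needed to cover a frame. The function name macroblocks is made up for this illustration; the only fact it relies on is the 16×16 luma block size described above, with partial blocks at the frame edges rounded up.

```shell
# macroblocks WIDTH HEIGHT
# Prints the number of 16x16 macroblocks needed to cover a frame.
# Dimensions that are not multiples of 16 are rounded up, because the
# frame is coded in whole macroblocks and cropped on output.
macroblocks() {
  mb_w=$(( ($1 + 15) / 16 ))
  mb_h=$(( ($2 + 15) / 16 ))
  echo $(( mb_w * mb_h ))
}

macroblocks 1920 1080   # 120 x 68 = 8160
macroblocks 1280 720    # 80 x 45 = 3600
```

Note that 1080 is not a multiple of 16, so a 1080p frame is coded as 68 macroblock rows (1088 lines) and cropped back to 1080 for display.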

Profiles and levels

Profiles cap which coding tools may be used.

Profile  | Characteristics                                                                  | Typical use
Baseline | No B-frames, CAVLC only, simpler tools                                           | Older mobile, realtime streams
Main     | B-frames, CABAC, etc.                                                            | Broadcast, general VOD
High     | 8×8 transform, etc.; richer chroma formats come with the High 4:2:2/4:4:4 variants | Blu-ray, high-quality archives

Levels bound resolution, frame rate, bitrate, and decoder complexity (e.g. Level 4.0/4.1 covers 1080p at up to about 30 fps, while 1080p at 60 fps needs Level 4.2). Match profile/level to container and device specs.
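
As a rough illustration of how a level constrains a stream, the sketch below checks a resolution and frame rate against two of a level's limits: macroblocks per frame and macroblocks per second. The function name fits_level is hypothetical, and the Level 4.1 numbers are quoted from Table A-1 of the H.264 specification as an assumption to verify against the spec. Real levels also bound bitrate and decoded picture buffer size, which this sketch ignores.

```shell
# fits_level WIDTH HEIGHT FPS MAX_FRAME_MBS MAX_MBS_PER_SEC
# Prints "fits" when both the per-frame macroblock count and the
# macroblocks-per-second rate stay within the given limits,
# otherwise prints "exceeds".
fits_level() {
  frame_mbs=$(( (($1 + 15) / 16) * (($2 + 15) / 16) ))
  mbs_per_sec=$(( frame_mbs * $3 ))
  if [ "$frame_mbs" -le "$4" ] && [ "$mbs_per_sec" -le "$5" ]; then
    echo "fits"
  else
    echo "exceeds"
  fi
}

# Assumed Level 4.1 limits: 8192 MBs per frame, 245760 MBs per second.
fits_level 1920 1080 30 8192 245760   # 8160 MBs, 244800 MB/s
fits_level 1920 1080 60 8192 245760   # 8160 MBs, 489600 MB/s
```

Under these assumed limits, 1080p at 30 fps fits and 1080p at 60 fps does not, which is why high-frame-rate 1080p targets usually call for a higher level.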


How compression works

Intra / inter prediction

  • Intra (I-slice/frame): Prediction within the same frame. Keyframes and scene cuts consume more bits.
  • Inter (P/B): Blocks are predicted from decoded frames with motion vectors; only residuals are coded. Simple motion means fewer bits.
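
One way to see the I/P/B mix in a real file is to tally the picture types that ffprobe reports. The sketch below is a small helper (the name count_pict_types is made up here) that reads one picture type per line; the ffprobe invocation in the comment uses standard options, and input.mp4 is a placeholder.

```shell
# count_pict_types
# Reads one picture type per line (I, P or B) on stdin and prints a tally.
# -F, makes awk ignore anything after a comma in CSV-style input.
count_pict_types() {
  awk -F, '{ n[$1]++ } END { printf "I=%d P=%d B=%d\n", n["I"], n["P"], n["B"] }'
}

# Typical use with a real file (requires ffprobe; input.mp4 is a placeholder):
#   ffprobe -v error -select_streams v:0 -show_entries frame=pict_type \
#     -of csv=p=0 input.mp4 | count_pict_types

# Self-contained demo on a hand-written sequence:
printf 'I\nP\nB\nB\nP\n' | count_pict_types
```

A stream with very few I-frames and many B-frames is leaning heavily on inter prediction; a stream that is all I-frames is intra-only and will be much larger at the same quality.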

Transform & quantization

Residuals are transformed with DCT-like integer transforms (4×4 / 8×8), then quantized so high frequencies and small coefficients are dropped or coarsened. Larger quantization steps shrink files but increase banding and blur. x264’s CRF (Constant Rate Factor) is the practical knob for this quality–size tradeoff.
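
To give a feel for how quickly quantization coarsens, the sketch below approximates the H.264 quantizer step size from the quantization parameter (QP), using the rule that the step doubles for every increase of 6 in QP and equals 1.0 at QP 4. The function name qstep is made up for this example, and the formula is an approximation: the specification defines the step sizes by table, and values between multiples of 6 differ slightly from the smooth curve.

```shell
# qstep QP
# Approximate quantizer step size for an H.264 QP value, using the
# rule that the step doubles every 6 QP and equals 1.0 at QP 4.
qstep() {
  awk -v qp="$1" 'BEGIN { printf "%.2f\n", 2 ^ ((qp - 4) / 6) }'
}

qstep 22   # 8.00
qstep 28   # 16.00
qstep 34   # 32.00
```

CRF is not the same thing as QP (x264 varies QP per frame and per macroblock around the CRF target), but the scales are related closely enough that "about 6 CRF points roughly halves or doubles the bitrate" is a common rule of thumb, not a guarantee.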

Entropy coding

H.264 defines two entropy coders, CAVLC and CABAC. CABAC compresses better but costs somewhat more to decode; it is available in Main and High but not in Baseline.

Compression pipeline (conceptual)

flowchart LR
  subgraph input [Input]
    YUV[YUV pixels]
  end
  subgraph pred [Prediction]
    Intra[Intra prediction]
    Inter[Inter prediction]
  end
  subgraph residual [Residual]
    TQ[Transform & quantize]
  end
  subgraph bits [Bitstream]
    EC[Entropy coding]
    BS[H.264 NAL units]
  end
  YUV --> Intra
  YUV --> Inter
  Intra --> TQ
  Inter --> TQ
  TQ --> EC
  EC --> BS

Practical encoding

Examples assume FFmpeg is installed (ffmpeg -version).

Quality-first (archive): libx264 + CRF

ffmpeg -i input.mov -c:v libx264 -crf 18 -preset slow -pix_fmt yuv420p \
  -c:a aac -b:a 192k output.mp4
  • -crf: Often 18–23 (lower is higher quality). Archives near 18; web often 20–23.
  • -preset: ultrafast through veryslow. Slower presets usually improve compression efficiency at the same CRF, at the cost of encode time.
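
A practical way to pick a CRF is to encode the same clip at a few values and compare size and appearance. The sketch below wraps that in a hypothetical helper, crf_sweep; the FFMPEG override variable is an assumption of this example (not an FFmpeg feature) that allows a dry run which prints the arguments instead of encoding.

```shell
# crf_sweep INPUT CRF [CRF...]
# Encodes INPUT once per CRF value into crf<N>.mp4 (video only).
# Set FFMPEG=echo for a dry run that prints the arguments instead of
# encoding; leave it unset to run the real ffmpeg.
crf_sweep() {
  input=$1
  shift
  for crf in "$@"; do
    "${FFMPEG:-ffmpeg}" -y -i "$input" -c:v libx264 -crf "$crf" \
      -preset slow -pix_fmt yuv420p -an "crf${crf}.mp4"
  done
}

# Dry run: show what a three-point sweep would pass to ffmpeg.
FFMPEG=echo
crf_sweep input.mov 18 21 23
```

After a real run, ls -l crf*.mp4 shows how file size moves with CRF; watch the outputs on the target display before settling on a value.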

Target bitrate (streaming)

ffmpeg -i input.mov -c:v libx264 -b:v 5M -maxrate 5M -bufsize 10M \
  -pix_fmt yuv420p -c:a aac -b:a 192k output.mp4

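Setting -maxrate together with -bufsize enables the rate-control buffer (VBV). With -maxrate equal to -b:v, as here, the result is capped, near-constant bitrate; -bufsize controls how far the rate may swing over short windows.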
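
A quick way to reason about -bufsize is to express it as seconds of video at the maximum rate. The helper below (vbv_seconds is a made-up name) just divides one by the other; both values are in kbit so the units cancel.

```shell
# vbv_seconds BUFSIZE_KBIT MAXRATE_KBPS
# Prints how many seconds of video at the maximum rate fit in the buffer.
vbv_seconds() {
  awk -v buf="$1" -v rate="$2" 'BEGIN { printf "%.1f\n", buf / rate }'
}

vbv_seconds 10000 5000   # the command above: -bufsize 10M, -maxrate 5M -> 2.0
vbv_seconds 5000 5000    # a tighter one-second buffer -> 1.0
```

A larger buffer lets the encoder spend more bits on hard scenes at the cost of bigger short-term bitrate swings; a smaller one keeps the rate steadier, which tends to suit live and low-latency delivery.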

Two-pass (strict size or bitrate planning)

ffmpeg -y -i input.mov -c:v libx264 -b:v 4M -preset slow -pass 1 -an -f null /dev/null && \
ffmpeg -i input.mov -c:v libx264 -b:v 4M -preset slow -pass 2 -c:a aac -b:a 192k output.mp4

On Windows use NUL instead of /dev/null.
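
When the goal is a specific file size, the video bitrate follows from simple arithmetic: total bits divided by duration, minus the audio bitrate. The sketch below uses a hypothetical helper, video_kbps, and assumes 1 MB = 8000 kbit while ignoring container overhead, so leave a small margin in practice.

```shell
# video_kbps SIZE_MB DURATION_S AUDIO_KBPS
# Prints the video bitrate (kbit/s) that fits a target file size,
# taking 1 MB as 8000 kbit and ignoring container overhead.
video_kbps() {
  total_kbps=$(( $1 * 8000 / $2 ))
  echo $(( total_kbps - $3 ))
}

video_kbps 700 3600 192   # one hour into 700 MB with 192k audio -> 1363
```

The result would be passed to both passes as -b:v 1363k.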

NVIDIA NVENC (speed priority)

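# -rc vbr with -b:v 0 lets -cq drive quality instead of the encoder's default bitrate target
ffmpeg -hwaccel cuda -i input.mov -c:v h264_nvenc -rc vbr -cq 23 -b:v 0 -preset p5 \
  -pix_fmt yuv420p -c:a aac -b:a 192k output.mp4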

Check ffmpeg -h encoder=h264_nvenc—option names vary by driver and GPU generation.

Parameter tuning

Goal                   | Direction
Maximum compatibility  | -profile:v high -level 4.0, -pix_fmt yuv420p
Low-latency live       | -tune zerolatency, fewer B-frames, smaller buffer
Film grain             | -tune film, slightly lower CRF (grain consumes bitrate)
Animation / flat areas | Consider -tune animation

Quality vs speed

  • At the same CRF, a slower preset often improves efficiency at the cost of encode time.
  • For realtime, hardware encoders or ultrafast + adequate bitrate are typical.
  • For one-off archives, slow preset + CRF often wins long-term.

Performance comparison

Compression vs other codecs

At similar visual quality, HEVC typically needs less bitrate than H.264, and AV1 needs about the same as or less than HEVC (content- and encoder-dependent). Treat H.264 as the compatibility baseline and move up when bandwidth or decode support allow.

Encode and decode speed

  • Decode: H.264 hardware decode is common—good power efficiency.
  • Encode: libx264 is efficient; realtime 4K often uses GPU encoders.

Hardware acceleration

  • Intel: Quick Sync (h264_qsv, etc., depends on platform and build)
  • NVIDIA: NVENC h264_nvenc
  • Apple: VideoToolbox (h264_videotoolbox)
  • AMD: AMF (h264_amf)

Verify decode on target devices before shipping.
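
Which of the encoders listed above is available depends on how FFmpeg was built. The sketch below scans the output of ffmpeg -encoders and falls back to libx264; the function name pick_h264_encoder and the preference order are assumptions of this example. Being listed only means the encoder was compiled in, not that matching hardware and drivers are present, so a short test encode is still needed.

```shell
# pick_h264_encoder
# Reads `ffmpeg -encoders` output on stdin and prints the first hardware
# H.264 encoder found, or libx264 when none is listed.
pick_h264_encoder() {
  listing=$(cat)
  for name in h264_nvenc h264_qsv h264_videotoolbox h264_amf; do
    if printf '%s\n' "$listing" | grep -qw -- "$name"; then
      echo "$name"
      return 0
    fi
  done
  echo libx264
}

# Typical use (requires ffmpeg):
#   ffmpeg -hide_banner -encoders | pick_h264_encoder

# Self-contained demo on a hand-written listing:
printf ' V....D libx264\n V....D h264_qsv\n' | pick_h264_encoder
```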


Real-world use cases

Streaming (YouTube, Netflix, etc.)

  • YouTube re-transcodes uploads; masters are often ProRes or high-bitrate H.264/H.265.
  • Large OTTs add HDR, subtitles, DRM; some legacy clients still need H.264-only ladders.

Mobile apps

  • iOS and Android essentially assume H.264 decode. Older devices may require Baseline/Main—validate profiles.

Web browsers

  • MP4 (H.264 + AAC) plays almost everywhere, including Safari—still the lowest-risk choice for broad compatibility.
  • WebRTC often lists H.264 as an interoperability codec.

Optimization tips

Smaller files without trashing quality

  • Resolution and frame rate dominate bitrate savings.
  • For speech-only audio, AAC at 96–128k is usually enough; for music-heavy content, 192–256k is common.
  • CRF + slower preset reduces wasted bits.

Faster encoding

  • Move preset one or two steps faster (medium → fast).
  • Switch to GPU encode (re-check quality).
  • Scale in-encoder, e.g. -vf scale=-2:1080.

Batch automation

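#!/usr/bin/env bash
# Transcode every .mov in the current directory to H.264/AAC MP4 under out/.
set -euo pipefail
shopt -s nullglob   # with no .mov files, the loop runs zero times instead of on the literal '*.mov'
mkdir -p out
for f in *.mov; do
  # -nostdin keeps ffmpeg from reading the terminal; -y overwrites existing outputs
  ffmpeg -nostdin -y -i "$f" -c:v libx264 -crf 21 -preset medium -pix_fmt yuv420p \
    -c:a aac -b:a 160k "out/${f%.mov}.mp4"
done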

On Windows PowerShell, adapt the loop syntax.


Common problems

Compatibility

  • Profile/level mismatch: Some TVs and STBs reject High-profile streams encoded at a level above what their decoder supports; try -level 4.1.
  • Pixel format: yuv420p is the usual minimum for web and mobile.

Quality issues

  • Upscaled sources will not regain detail at higher bitrate—fix source resolution and exposure first.
  • Low bitrate + fast motion often needs more bitrate, GOP tuning, and B-frame settings—not just a higher profile.
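
For the GOP tuning mentioned above, a common starting point is a fixed keyframe interval expressed in seconds. The sketch below turns frame rate and interval into libx264 options; gop_flags is a made-up helper name, while -g, -keyint_min, and -sc_threshold are standard FFmpeg options. Disabling scene-cut keyframes with -sc_threshold 0 is a choice that suits segmented streaming and can cost some quality at scene changes.

```shell
# gop_flags FPS SECONDS
# Prints libx264 options for a fixed keyframe interval of SECONDS at FPS.
gop_flags() {
  frames=$(( $1 * $2 ))
  echo "-g $frames -keyint_min $frames -sc_threshold 0"
}

gop_flags 30 2   # -g 60 -keyint_min 60 -sc_threshold 0
```

B-frame count is set separately with -bf; fewer B-frames lower latency, more usually help compression.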

Licensing

  • H.264 is covered by many patents; commercial products may need license review. Using an open-source encoder does not automatically clear all rights.

Wrap-up

Summary

  • H.264 remains the baseline codec for compatibility, ecosystem, and hardware decode.
  • In practice, decide CRF versus target bitrate, preset, profile/level, and yuv420p together for stable results.
  • Before moving to HEVC or AV1, lock device targets and legal/licensing constraints.

When to choose H.264

  • Maximum compatibility for web and mobile, low-latency live with broad device support, and intermediate/delivery formats in editing pipelines. For bandwidth-critical cases, compare with the HEVC guide and AV1 guide.