H.264 (AVC) Video Codec Complete Guide | Profile, FFmpeg, Streaming Practice

H.264 (AVC) Video Codec Complete Guide | Profile, FFmpeg, Streaming Practice

이 글의 핵심

H.264 (AVC) is still the most widely used video codec — compatibility, tool ecosystem, and practical FFmpeg settings all in one.

Introduction

H.264 / AVC (Advanced Video Coding) is the most widely deployed lossy compression video codec since the mid-2000s. From broadcasting to OTT, mobile cameras, video conferencing, and game recording, almost all platforms have hardware decoders built-in, so the practical standard “encode once, play anywhere” still points to H.264.

However, compression efficiency is lower than HEVC and AV1, and deployment strategy considering patent and licensing issues is needed. This guide aims to understand the compression pipeline and connect to FFmpeg settings you can use immediately.

After Reading This

  • Explain H.264’s block structure, profiles, and levels in business language
  • Conceptually grasp Intra/Inter prediction → Transform & Quantization → Entropy Coding flow
  • Balance quality, speed, and compatibility with libx264, NVENC/VAAPI, etc.
  • Organize position and common issues in YouTube, mobile, browser environments

Table of Contents

  1. Codec Overview
  2. Compression Principles
  3. Practical Encoding
  4. Performance Comparison
  5. Real-World Use Cases
  6. Optimization Tips
  7. Common Issues and Solutions
  8. Conclusion

Codec Overview

History and Development Background

H.264 was standardized by the Joint Video Team (JVT), a joint project of ITU-T VCEG and ISO/IEC MPEG, and published as MPEG-4 Part 10 and ITU-T H.264. Designed to achieve roughly half or less bitrate at same quality compared to existing MPEG-2 and MPEG-4 Part 2 (e.g., DivX/Xvid family), it has rich tools (slices, reference frames, etc.) for network streaming and low-latency real-time.

Technical Features

  • Macroblock-based: Typically encodes in 16×16 luma macroblock units (chroma is 8×8, etc., depending on subsampling).
  • Variable block size: Inter prediction subdivides into 16×16 ~ 4×4 partitions to match motion.
  • Multiple references & B-frames: References past and future frames to remove temporal redundancy (B-frame usage varies by profile).
  • Intra prediction: Uses 4×4 and 16×16 modes to exploit spatial correlation.
  • Deblocking filter: Mitigates block boundary artifacts.

Major Profiles and Levels

Profile restricts which coding tools can be used.

ProfileFeatureCommon Use
BaselineLimited B-frames, simpler toolsOld mobile, real-time streams
MainB-frames, CABAC, etc.Broadcasting, general VOD
High8×8 transform, wider color info, etc.Blu-ray, high-quality archive

Level standardizes resolution, frame rate, bitrate limits, etc., decoder complexity (e.g., Level 4.1 often used for 1080p high frame). Matching profile/level compatibility to container or device specs is important.


Compression Principles

Intra / Inter Prediction

  • Intra (I-slice/frame): Predicts only within same frame. Uses many bits for keyframes and after scene changes.
  • Inter (P/B): Predicts from already decoded frames with block-level motion vectors, leaving only residual. Simpler motion reduces bits.

Transform & Quantization

Residual blocks are moved to frequency domain by DCT-family integer transform (4×4, 8×8), then quantization discards or reduces high-frequency and small coefficients. Larger quantization step makes smaller files but increases banding and blur. x264’s CRF (Constant Rate Factor) essentially captures this “quality vs readability” balance in one number, simplifying practical collaboration.

Entropy Coding

H.264 uses CAVLC (context-adaptive variable-length) and CABAC (context-adaptive binary arithmetic). CABAC has better compression efficiency but slightly higher decoding load (common in Main/High).

Compression Pipeline Diagram

flowchart LR
  subgraph input [Input]
    YUV[YUV Pixels]
  end
  subgraph pred [Prediction]
    Intra[Intra Prediction]
    Inter[Inter Prediction]
  end
  subgraph residual [Residual Processing]
    TQ[Transform & Quantize]
  end
  subgraph bits [Bitstream]
    EC[Entropy Coding]
    BS[H.264 NAL Units]
  end
  YUV --> Intra
  YUV --> Inter
  Intra --> TQ
  Inter --> TQ
  TQ --> EC
  EC --> BS

Practical Encoding

Below examples assume FFmpeg is installed (check with ffmpeg -version).

Quality Priority (Archive): libx264 + CRF

ffmpeg -i input.mov -c:v libx264 -crf 18 -preset slow -pix_fmt yuv420p \
  -c:a aac -b:a 192k output.mp4
  • -crf: Usually 18~23 (lower is higher quality). Broadcast archive is around 18, web distribution is 20~23 common.
  • -preset: ultrafast ~ veryslow. Slower often gives slightly better efficiency at same CRF.

Target Bitrate (For Streaming)

ffmpeg -i input.mov -c:v libx264 -b:v 5M -maxrate 5M -bufsize 10M \
  -pix_fmt yuv420p -c:a aac -b:a 192k output.mp4

For VBR-like usage, pattern of specifying -maxrate / -bufsize together is common.

2-pass (When File Size and Quality Prediction Important)

# Pass 1
ffmpeg -y -i input.mov -c:v libx264 -b:v 4M -preset slow -pass 1 -an -f mp4 /dev/null

# Pass 2
ffmpeg -i input.mov -c:v libx264 -b:v 4M -preset slow -pass 2 -c:a aac -b:a 192k output.mp4

On Windows, use NUL instead of /dev/null.

NVIDIA NVENC (Speed Priority)

ffmpeg -hwaccel cuda -i input.mov -c:v h264_nvenc -cq 23 -preset p5 \
  -pix_fmt yuv420p -c:a aac -b:a 192k output.mp4

-preset option names may differ by driver and GPU generation, check with ffmpeg -h encoder=h264_nvenc.

Parameter Tuning Guide

GoalRecommended Direction
Maximum compatibility-profile:v high -level 4.0, -pix_fmt yuv420p
Low-latency live-tune zerolatency, reduce B-frames, small buffer
Film grain preservation-tune film, slightly lower CRF (grain consumes many bits)
Animation/flatConsider -tune animation

Quality vs Speed Tradeoff

  • One step slower preset at same CRF increases encoding time but often gets closer to same quality and file size.
  • For real-time, hardware encoder or ultrafast + appropriate bitrate is realistic.
  • For one-time archive, slow preset + CRF is often advantageous long-term.

Performance Comparison

Compression Ratio vs Other Codecs (Approximate Trend)

At same visual quality, generally HEVC > H.264, AV1 ≥ HEVC (varies by content and encoder). H.264 is the baseline, and select higher codecs based on whether bandwidth is absolutely insufficient and decoder compatibility is possible.

Encoding and Decoding Speed

  • Decoding: H.264 has HW decoding on almost all devices, good power efficiency.
  • Encoding: libx264 is very efficient for quality, but real-time 4K often uses GPU encoders.

Hardware Acceleration Support

  • Intel: Quick Sync (h264_qsv, etc., depends on platform and FFmpeg build)
  • NVIDIA: NVENC h264_nvenc
  • Apple: VideoToolbox (h264_videotoolbox)
  • AMD: AMF (h264_amf)

Check actual decoding on target devices before deployment.


Real-World Use Cases

Streaming Services (YouTube, Netflix, etc.)

  • YouTube transcodes uploads to multiple codecs and resolutions internally. Upload source is often editing codec (ProRes, etc.) or high-bitrate H.264/H.265.
  • Large OTT like Netflix use complex packaging including HDR, multilingual subtitles, DRM. Legacy supporting only H.264 still exists depending on viewing device.

Mobile Apps

  • iOS and Android both have H.264 decoding as de facto standard. If old devices require Baseline/Main restriction, profile verification is needed.

Web Browser Support

  • MP4 (H.264 + AAC) plays on almost all browsers. If wide compatibility including Safari is needed, H.264 is still lowest risk choice.
  • In WebRTC real-time video, H.264 often appears as compatible codec.

Live Streaming (WebRTC, RTMP)

# Low-latency live encoding
ffmpeg -f v4l2 -i /dev/video0 -c:v libx264 -tune zerolatency \
  -preset veryfast -b:v 2M -maxrate 2M -bufsize 2M \
  -g 60 -c:a aac -b:a 128k -f flv rtmp://server/live/stream
  • -tune zerolatency: Disables B-frames and lookahead for minimal latency
  • -g 60: Keyframe interval (2 seconds at 30fps)
  • -bufsize: Controls VBV buffer size

Optimization Tips

Reduce File Size While Maintaining Quality

  • Adjusting resolution and frame rate first has biggest bitrate reduction.
  • Audio: Rationalize to AAC 96~128k for voice only, 192~256k for music-focused.
  • Reduce “bit waste” with CRF + slow preset.

Improve Encoding Speed

  • One or two steps faster preset (mediumfast).
  • Switch to GPU encoder (quality check required).
  • Handle resolution scale down in encoder (-vf scale=-2:1080).

Batch Processing Automation

#!/usr/bin/env bash
set -euo pipefail
mkdir -p out
for f in *.mov; do
  ffmpeg -y -i "$f" -c:v libx264 -crf 21 -preset medium -pix_fmt yuv420p \
    -c:a aac -b:a 160k "out/${f%.mov}.mp4"
done

On Windows PowerShell, for loop and % expansion syntax differ, so move same logic based on Get-ChildItem.

Advanced Tuning Options

# Film grain preservation
ffmpeg -i input.mov -c:v libx264 -crf 18 -preset slow -tune film \
  -pix_fmt yuv420p -c:a aac -b:a 192k output.mp4

# Animation content
ffmpeg -i input.mov -c:v libx264 -crf 20 -preset medium -tune animation \
  -pix_fmt yuv420p -c:a aac -b:a 160k output.mp4

# Fast start for web (moov atom at beginning)
ffmpeg -i input.mov -c:v libx264 -crf 23 -preset medium -pix_fmt yuv420p \
  -c:a aac -b:a 128k -movflags +faststart output.mp4

Common Issues and Solutions

Compatibility Issues

  • Profile/level mismatch: Old TVs and set-top boxes may not accept [email protected], etc. Test by lowering to -level 4.1, etc.
  • Pixel format: Lowest compatibility for web and mobile is usually yuv420p.
# Force compatibility settings
ffmpeg -i input.mov -c:v libx264 -profile:v high -level 4.1 \
  -pix_fmt yuv420p -c:a aac -b:a 128k output.mp4

Quality Degradation Issues

  • Re-encoding upscaled source at high bitrate does not restore detail. Check source resolution and shooting exposure first.
  • Fast motion at low bitrate easily causes blocking, so GOP length and B-frame adjustment along with raising bitrate itself is often fundamental response.
# Increase bitrate for high-motion content
ffmpeg -i input.mov -c:v libx264 -b:v 8M -maxrate 10M -bufsize 20M \
  -g 30 -bf 2 -c:a aac -b:a 192k output.mp4

Licensing Considerations

  • H.264 has extensive known patents, and commercial products and services may need license review depending on internal policy, region, and implementation method. Using open-source encoder alone does not resolve all rights.

Playback Issues

# Check video stream info
ffprobe -v error -select_streams v:0 -show_entries stream=codec_name,profile,level,width,height,r_frame_rate -of default=noprint_wrappers=1 video.mp4

# Check if moov atom is at beginning (for web streaming)
ffprobe -v error -show_entries format=start_time -of default=noprint_wrappers=1:nokey=1 video.mp4

Conclusion

Key Summary

  • H.264 is still the reference codec in terms of compatibility, ecosystem, and hardware decoding.
  • In practice, thinking about CRF vs target bitrate, preset, profile/level, and yuv420p together gives stable results.
  • Before moving to HEVC or AV1, safely determine deployment target devices and legal/licensing conditions first.
  • Maximum compatibility web and mobile distribution, live streaming (low latency, wide devices), intermediate and delivery format in editing pipeline — H.264 remains a strong choice. When bandwidth becomes more important, compare and decide together with HEVC Guide and AV1 Guide.

Quick Command Reference

# High quality archive
ffmpeg -i input.mov -c:v libx264 -crf 18 -preset slow -pix_fmt yuv420p -c:a aac -b:a 192k output.mp4

# Web streaming (fast start)
ffmpeg -i input.mov -c:v libx264 -crf 23 -preset medium -pix_fmt yuv420p -c:a aac -b:a 128k -movflags +faststart output.mp4

# Hardware encoding (NVENC)
ffmpeg -hwaccel cuda -i input.mov -c:v h264_nvenc -cq 23 -preset p5 -pix_fmt yuv420p -c:a aac -b:a 192k output.mp4

# Low-latency live
ffmpeg -i input.mov -c:v libx264 -tune zerolatency -preset veryfast -b:v 2M -maxrate 2M -bufsize 2M -g 60 -c:a aac -b:a 128k output.mp4

  • HEVC (H.265) Practical Guide
  • AV1 Next-Generation Codec Guide
  • H.264 vs HEVC vs AV1 Comparison

Keywords

H.264, AVC, Video Codec, FFmpeg, libx264, NVENC, Streaming, Encoding, Profile, Level, CRF, Compression, MPEG-4