AAC Audio Codec: Complete Guide | LC-AAC, HE-AAC & FFmpeg Encoding

Key points of this article

AAC profiles, FFmpeg recipes, and streaming practice for high-quality, general-purpose audio in HLS, DASH, and MP4.

Introduction

AAC (Advanced Audio Coding) is the MPEG-family successor to MP3, designed for better quality at the same bitrate. With HLS, DASH, and MP4, it is the de facto choice for modern streaming; profile (LC-AAC, HE-AAC) and encoder choices directly affect perceived quality and bandwidth.

This article connects codec basics to reproducible FFmpeg examples and bitrate/container tradeoffs.

After reading this post

  • Understand AAC’s place in MPEG-2/4 and LC-AAC vs HE-AAC
  • Grasp psychoacoustics and MDCT-based block coding at a high level
  • Build FFmpeg command lines for your delivery target
  • Choose bitrates and containers for streaming and mobile

Table of contents

  1. Codec overview
  2. How compression works
  3. Practical encoding
  4. Performance comparison
  5. Real-world use cases
  6. Optimization tips
  7. Common problems
  8. Wrap-up

Codec overview

History and background

AAC was standardized in ISO/IEC 13818-7 (MPEG-2 Part 7) and extended in ISO/IEC 14496-3 (MPEG-4 Audio). Improvements over MP3 include a pure MDCT filter bank in place of MP3's hybrid filter bank, temporal noise shaping (TNS), and better modeling of tonal and noise-like components, which together give more stable quality at lower bitrates. Apple's adoption of AAC-LC and the use of AAC in adaptive streaming (HLS) drove wide adoption.

Technical characteristics

| Topic | Description |
|---|---|
| Compression | Perceptual coding: MDCT-like transform, quantization, lossless codebooks |
| Sample rate | Typically 8 kHz–96 kHz (profile-dependent); 44.1 / 48 kHz dominate |
| Bitrate | Music often 128–256 kbps stereo; HE-AAC helps at lower rates |
| Channels | Mono through multichannel; streaming often stereo |

Main profiles

  • AAC-LC (Low Complexity): Best-supported profile—default for music, podcasts, VOD.
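  • HE-AAC v1 (AAC-LC + SBR, spectral band replication): Reconstructs high frequencies from the coded low band plus a small amount of side information; suited to mobile and radio-style streams at low bitrates.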
  • HE-AAC v2 (parametric stereo + SBR): Very low stereo bitrates—often speech-heavy content.

HE-AAC profiles can increase decoder cost and battery use, so do not chase the lowest bitrate without first confirming that your target devices support the profile.
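The profile trade-off above can be sketched as a small selection rule. This is a hypothetical helper, not part of FFmpeg: the function name and the 96 and 48 kbps cut-offs are illustrative assumptions to be tuned with listening tests. The returned strings follow the profile names FFmpeg accepts for -profile:a; the HE-AAC profiles need an encoder such as libfdk_aac.

```python
def pick_profile(stereo_kbps: int, client_supports_he: bool) -> str:
    """Return an FFmpeg -profile:a name for a stereo target bitrate.

    The 96 and 48 kbps cut-offs are assumptions for illustration,
    not values taken from the AAC specification.
    """
    if not client_supports_he or stereo_kbps >= 96:
        return "aac_low"      # AAC-LC: the compatibility baseline
    if stereo_kbps >= 48:
        return "aac_he"       # HE-AAC v1 (SBR)
    return "aac_he_v2"        # HE-AAC v2 (SBR + parametric stereo)


print(pick_profile(128, client_supports_he=True))   # aac_low
print(pick_profile(64, client_supports_he=True))    # aac_he
print(pick_profile(32, client_supports_he=False))   # aac_low
```

Falling back to AAC-LC whenever client support is unknown keeps the rule conservative.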


How compression works

Psychoacoustic model

Human hearing exhibits simultaneous and temporal masking. AAC spends fewer bits on masked components by using coarser quantization there—perceptual bit saving, not arbitrary data deletion.
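A toy calculation can make "coarser quantization where masked" concrete. The sketch below assumes a uniform quantizer, whose noise power is roughly step² / 12, and picks the largest step that keeps the noise at a given masking threshold. AAC itself uses a non-uniform quantizer with scale factors, so the function names and the dB values are illustrative assumptions only.

```python
import math


def step_for_mask(mask_db: float) -> float:
    """Largest uniform quantizer step whose noise power (step**2 / 12)
    stays at a masking threshold given in dB relative to unit power."""
    mask_power = 10 ** (mask_db / 10)
    return math.sqrt(12 * mask_power)


def quantize(value: float, step: float) -> float:
    """Round value to the nearest multiple of step."""
    return step * round(value / step)


audible_step = step_for_mask(0.0)    # low threshold: little masking
masked_step = step_for_mask(20.0)    # threshold raised 20 dB by a loud neighbour

# A 20 dB higher masking threshold allows a step about ten times coarser.
print(round(masked_step / audible_step, 3))
print(quantize(7.3, 2.0), quantize(7.3, 5.0))
```

A coarser step means fewer distinct quantized values to code, which is where the bit saving comes from.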

MDCT (Modified Discrete Cosine Transform)

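AAC uses an MDCT filter bank: 50%-overlapping windowed blocks are transformed, and the decoder recombines them by overlap-add, which cancels time-domain aliasing and reduces block-edge artifacts. The encoder switches between long blocks (1024 coefficients) for stationary passages and groups of eight short blocks (128 coefficients each) around transients, which limits pre-echo.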
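The overlap-add idea can be checked with a direct (slow) MDCT in pure Python. This is a minimal sketch under stated assumptions: a sine window, a tiny block size of 8 coefficients instead of AAC's 1024 or 128, no quantization, and hypothetical function names. It only demonstrates that windowed MDCT blocks with 50% overlap reconstruct the input.

```python
import math


def sine_window(length):
    """Sine window; satisfies w[n]**2 + w[n + length/2]**2 == 1."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def mdct(block):
    """2N windowed samples -> N coefficients (direct O(N^2) form)."""
    n = len(block) // 2
    return [
        sum(block[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
            for t in range(2 * n))
        for k in range(n)
    ]


def imdct(coeffs):
    """N coefficients -> 2N time-aliased samples (2/N scaling for windowed use)."""
    n = len(coeffs)
    return [
        2.0 / n * sum(coeffs[k] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
                      for k in range(n))
        for t in range(2 * n)
    ]


def roundtrip(signal, n=8):
    """Window -> MDCT -> IMDCT -> window -> overlap-add with hop n."""
    window = sine_window(2 * n)
    padded = [0.0] * n + list(signal) + [0.0] * n   # every sample covered twice
    padded += [0.0] * (-len(padded) % n)            # pad to a multiple of n
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - n, n):
        block = [padded[start + i] * window[i] for i in range(2 * n)]
        rebuilt = imdct(mdct(block))
        for i in range(2 * n):
            out[start + i] += rebuilt[i] * window[i]
    return out[n:n + len(signal)]


signal = [math.sin(0.3 * i) + 0.5 * math.cos(1.1 * i) for i in range(40)]
error = max(abs(a - b) for a, b in zip(signal, roundtrip(signal)))
print("max reconstruction error:", error)
```

Each block on its own contains aliasing; it cancels only when neighbouring blocks are added together, which is why the transform can be critically sampled and still avoid hard block edges.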

Bit allocation

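Internally, the encoder distributes a limited bit budget across frequency bands, giving more bits to bands where quantization noise would otherwise rise above the masking threshold and treating tonal and noise-like components differently. At low bitrates, encoders typically narrow the coded bandwidth, and noise-like bands may be represented by perceptual noise substitution (PNS) instead of coded coefficients. HE-AAC adds an SBR layer for high-frequency efficiency.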
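One classic way to picture bit allocation is a greedy loop: give the next bit to the band whose quantization noise is furthest above its masking threshold. The sketch below is a simplified textbook model, not the rate-distortion search real AAC encoders run. It assumes each extra bit lowers noise by about 6.02 dB, and the signal-to-mask ratios (SMR) are made-up inputs.

```python
DB_PER_BIT = 6.02  # approximate SNR gain per bit of quantizer resolution


def allocate_bits(smr_db, pool):
    """Greedily spend up to `pool` bits across bands.

    smr_db: signal-to-mask ratio per band in dB (hypothetical inputs).
    Returns the number of bits assigned to each band.
    """
    bits = [0] * len(smr_db)
    for _ in range(pool):
        # Noise-to-mask ratio: how far each band's noise sits above its mask.
        nmr = [smr - DB_PER_BIT * b for smr, b in zip(smr_db, bits)]
        worst = max(range(len(nmr)), key=nmr.__getitem__)
        if nmr[worst] <= 0:
            break  # all noise is already below the masking threshold
        bits[worst] += 1
    return bits


print(allocate_bits([30.0, 12.0, -5.0], pool=6))    # [4, 2, 0]
print(allocate_bits([30.0, 12.0, -5.0], pool=20))   # [5, 2, 0]
```

The third band is fully masked and receives nothing, and with a generous pool the loop stops early instead of spending bits that would make no audible difference.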

Pipeline (simplified)

flowchart LR
  PCM["PCM input"]
  PSY["Psychoacoustic analysis"]
  MDCT["MDCT / filter bank"]
  QUANT["Quantize & codebooks"]
  LOSSLESS["Lossless entropy coding"]
  OUT["AAC bitstream"]
  PCM --> PSY
  PCM --> MDCT
  PSY --> QUANT
  MDCT --> QUANT
  QUANT --> LOSSLESS
  LOSSLESS --> OUT

Practical encoding

Assume WAV (PCM) input unless noted.

AAC-LC CBR stereo 128 kbps at 48 kHz

ffmpeg -i input.wav -c:a aac -b:a 128k -ar 48000 -ac 2 -aac_coder twoloop output.m4a

AAC-LC quality mode (libfdk_aac when available)

Some builds include libfdk_aac. Its quality (VBR) modes and availability vary by build, so confirm that the encoder is present with ffmpeg -encoders before relying on it.

ffmpeg -i input.wav -c:a libfdk_aac -profile:a aac_low -vbr 4 -ar 44100 -ac 2 output.m4a

AAC-LC for HLS (ADTS raw; segmenting is separate)

ffmpeg -i input.wav -c:a aac -b:a 160k -ar 48000 -ac 2 -f adts aac_160k.aac
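Since the ADTS file above is not segmented, one option is to let FFmpeg's hls muxer encode and segment in a single step. The sketch below only builds and prints the command line. The helper name, the file names, and the 6-second segment duration are assumptions, while -hls_time, -hls_playlist_type, and -hls_segment_type are options of the hls muxer.

```python
import shlex


def hls_audio_args(src: str, playlist: str, kbps: int = 160,
                   segment_seconds: int = 6) -> list[str]:
    """Build an FFmpeg argument list for an audio-only VOD HLS rendition."""
    return [
        "ffmpeg", "-i", src,
        "-vn",                                   # audio only
        "-c:a", "aac", "-b:a", f"{kbps}k", "-ar", "48000", "-ac", "2",
        "-f", "hls",
        "-hls_time", str(segment_seconds),       # target segment duration (s)
        "-hls_playlist_type", "vod",
        "-hls_segment_type", "fmp4",             # fragmented MP4 segments
        playlist,
    ]


# Print the command for review; pass the list to subprocess.run to execute it.
print(shlex.join(hls_audio_args("input.wav", "audio_160k.m3u8")))
```

Keeping the arguments as a list avoids shell-quoting problems when file names contain spaces.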

Parameter guide

  • Sample rate: If the source music is 44.1 kHz, keeping it at 44.1 kHz and avoiding an unnecessary resample is usually safest. For audio muxed with video, 48 kHz often simplifies sync and compatibility.
  • Bitrate: 128–192 kbps stereo is common; raise a step for classical or dense mixes.
  • Encoder: Native aac is widely usable; libfdk_aac offers rich options but may be absent due to licensing.
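To decide whether a resample is needed at all, the source sample rate can be read with ffprobe before encoding. The sketch below is hedged: the helper name is hypothetical, and SAMPLE is a hand-written stand-in shaped like ffprobe's JSON output, not captured from a real run.

```python
import json

# ffprobe arguments; append the input path before running.
FFPROBE_ARGS = [
    "ffprobe", "-v", "error", "-select_streams", "a:0",
    "-show_entries", "stream=codec_name,sample_rate,channels",
    "-of", "json",
]

# Hand-written example of the expected JSON shape (an assumption).
SAMPLE = '{"streams": [{"codec_name": "pcm_s16le", "sample_rate": "44100", "channels": 2}]}'


def needs_resample(ffprobe_json: str, target_hz: int) -> bool:
    """True if the first audio stream's sample rate differs from target_hz."""
    stream = json.loads(ffprobe_json)["streams"][0]
    return int(stream["sample_rate"]) != target_hz


print(needs_resample(SAMPLE, 44100))   # False: encode without -ar
print(needs_resample(SAMPLE, 48000))   # True: resampling would be required
```

When the function returns False, the -ar option can simply be left out of the encode command.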

Quality vs file size

Moving from 128 to 192 kbps is often a clearly audible step; at 256 kbps and above, weigh the smaller gains against bandwidth and CDN cost. Use ABX listening tests to set team baselines.
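The size side of that trade-off is simple arithmetic: bitrate times duration. The sketch below ignores container overhead, and the 4-minute track length is an assumption chosen for illustration.

```python
def encoded_size_mb(kbps: float, seconds: float) -> float:
    """Approximate audio payload size in megabytes (10**6 bytes)."""
    return kbps * 1000 / 8 * seconds / 1_000_000


# A 4-minute (240 s) track at three common stereo bitrates.
for kbps in (128, 192, 256):
    print(kbps, "kbps ->", round(encoded_size_mb(kbps, 240), 2), "MB")
```

For this track the three rates come to roughly 3.84, 5.76, and 7.68 MB, so each step adds about 1.9 MB per track to storage and delivery.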


Performance comparison

Efficiency vs other codecs

At similar listening conditions, AAC-LC often edges MP3 at the same bitrate (content-dependent). Opus excels at low latency and speech; AAC wins legacy device and pipeline compatibility.

Encode and decode speed

  • Decode: Hardware AAC decoding is common on mobile devices, which keeps playback power consumption low.
  • Encode: Native aac is often fast enough for batch jobs; slower, higher-quality encoder settings cost more CPU time.

MOS

MOS (Mean Opinion Score) results depend on the conditions under which the listening test was run. For a service, design listening tests around your target devices and headphones.


Real-world use cases

Streaming

Major music services use AAC widely. Spotify and others vary by platform—check current public docs.

Mobile

iOS and Android decode AAC by default. Offline cache and background playback often drive bitrate ladders and HE-AAC A/B tests.

VoIP and WebRTC

Realtime voice usually prefers Opus. AAC fits files and adaptive streaming more than ultra-low-latency RTC.

Browsers

AAC in MP4 (fMP4) is broadly supported under <audio> and MSE. Raw ADTS-only playback is more constrained—MP4 is usually safer on the web.
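When targeting MSE, each profile is identified by a codecs string built from its MPEG-4 audio object type (2 for AAC-LC, 5 for SBR, 29 for parametric stereo). The lookup below is a small sketch; the dictionary and function names are hypothetical, and the strings are what a page would pass to MediaSource.isTypeSupported in the browser.

```python
# MIME type plus codecs parameter per AAC profile.
MSE_TYPES = {
    "aac_low":   'audio/mp4; codecs="mp4a.40.2"',    # AAC-LC
    "aac_he":    'audio/mp4; codecs="mp4a.40.5"',    # HE-AAC v1 (SBR)
    "aac_he_v2": 'audio/mp4; codecs="mp4a.40.29"',   # HE-AAC v2 (SBR + PS)
}


def mse_type(profile: str) -> str:
    """Return the MIME/codecs string for a profile name, or raise KeyError."""
    return MSE_TYPES[profile]


print(mse_type("aac_low"))
```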


Optimization tips

Smaller files without trashing quality

  • Avoid pointless resampling (44.1 ↔ 48) to reduce aliasing risk.
  • For low-bitrate mobile, consider HE-AAC if clients support it.

Faster encoding

  • Single-pass CBR keeps batch time predictable.
  • Parallelize with GNU parallel or similar:
find ./wav -name '*.wav' -print0 | parallel -0 ffmpeg -y -i {} -c:a aac -b:a 160k {.}.m4a

CI regression hints

Store input checksums and spectrogram snapshots of the encoded output so that changes introduced by encoder upgrades are caught.
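The checksum half of that idea needs only the standard library. In this sketch the manifest is assumed to be a plain dict of file name to SHA-256 hex digest; the function names and that layout are illustrative choices, not an established convention.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """SHA-256 of an in-memory byte string."""
    return hashlib.sha256(data).hexdigest()


def file_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks so large WAVs do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_inputs(stored: dict[str, str], current: dict[str, str]) -> list[str]:
    """Names whose checksum differs from, or is missing in, the stored manifest."""
    return sorted(name for name, digest in current.items()
                  if stored.get(name) != digest)


stored = {"a.wav": sha256_hex(b"take 1"), "b.wav": sha256_hex(b"take 2")}
current = {"a.wav": sha256_hex(b"take 1"), "b.wav": sha256_hex(b"take 2 edited")}
print(changed_inputs(stored, current))   # ['b.wav']
```

If the inputs are unchanged but the output spectrograms differ, the encoder or its settings are the likely cause.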


Common problems

Compatibility

  • HE-AAC-only streams may fail on old devices—ship AAC-LC as a baseline.
  • ADTS vs MP4: Same AAC, different container—player support differs.

Quality

  • Clipped or over-limited masters sound bad in any codec, and lossy encoding can push peaks slightly higher, so keep the master's true peak around −1 to −3 dBTP.
  • Harsh hi-hats at low bitrate: raise bitrate or tune encoder short-block behavior.
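A quick headroom check on a master can be sketched as follows. It measures sample peak in dBFS on floating-point samples in the range −1.0 to 1.0. That is only an approximation of the dBTP figure mentioned above, because true peak requires oversampling. The function names and the −1 dB default ceiling are assumptions.

```python
import math


def sample_peak_dbfs(samples) -> float:
    """Peak level in dBFS for samples normalized to the range -1.0..1.0."""
    peak = max(abs(s) for s in samples)
    return -math.inf if peak == 0 else 20 * math.log10(peak)


def has_headroom(samples, ceiling_db: float = -1.0) -> bool:
    """True if the sample peak stays at or below the chosen ceiling."""
    return sample_peak_dbfs(samples) <= ceiling_db


print(round(sample_peak_dbfs([0.5, -0.25]), 2))   # -6.02
print(has_headroom([1.0, -0.9]))                  # False: peaks at 0 dBFS
```

Because inter-sample peaks can exceed the sample peak, treat a pass here as necessary but not sufficient.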

Licensing

AAC may require patent pool consideration for some products. libfdk_aac has its own license—check redistribution terms.


Wrap-up

Summary

  • AAC-LC is the compatibility baseline; HE-AAC helps at low bitrates.
  • Internally: psychoacoustic model + MDCT block coding removes less audible information first.
  • Standardize sample rates and bitrate ladders, and decide MP4 vs ADTS per delivery channel.

When to choose AAC

  • Music, podcasts, VOD: start 128–192 kbps stereo and tune with listening tests.
  • Adaptive streaming: multiple AAC rungs; client picks by network.
  • Maximum compatibility: AAC-LC + MP4 as default, special profiles only where needed.
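The "client picks by network" step in the adaptive streaming item can be sketched as choosing the highest rung that fits within a safety margin of measured throughput. The ladder values, the 0.8 safety factor, and the function name are illustrative assumptions; real players use more elaborate heuristics.

```python
LADDER_KBPS = [64, 96, 128, 192]   # hypothetical AAC rungs


def pick_rung(throughput_kbps: float, ladder=LADDER_KBPS, safety: float = 0.8) -> int:
    """Highest rung not exceeding safety * throughput; lowest rung as fallback."""
    budget = throughput_kbps * safety
    fitting = [rung for rung in sorted(ladder) if rung <= budget]
    return fitting[-1] if fitting else min(ladder)


print(pick_rung(200))    # 128 (budget 160 kbps)
print(pick_rung(50))     # 64  (nothing fits, fall back to the lowest rung)
print(pick_rung(1000))   # 192
```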

References