MP4 Container: Complete Guide | ISO BMFF, moov, mdat, fMP4 & FFmpeg

Key takeaway

MP4 (ISO BMFF) is the closest thing to a universal container. This guide covers its box layout, streaming optimization, and fMP4 in one pass.

Introduction

MP4 is a familiar extension; the precise mental model is ISO Base Media File Format (ISO BMFF, ISO/IEC 14496-12) plus the MP4-specific conventions of ISO/IEC 14496-14. It is the default container for H.264/H.265/AV1 video with AAC audio, and it is what mobile, OTT, web, and editing tools expect.

In streaming operations, three things drive playback failures, buffering, and time-to-first-frame: where moov sits in the file, whether the file is fragmented (fMP4), and whether the brands declared in ftyp match what players accept. This article gives you the terms and the FFmpeg fixes without requiring you to parse every byte by hand.

After reading this post

  • Explain the ISO BMFF box tree (ftyp, moov, mdat)
  • Distinguish progressive playback (faststart) from fragmented MP4 (HLS/DASH)
  • Remux, retag, and map tracks with FFmpeg patterns you can copy
  • Compare overhead and compatibility vs other containers

Table of contents

  1. Container overview
  2. Internal structure
  3. Hands-on usage
  4. Performance comparison
  5. Real-world use cases
  6. Optimization tips
  7. Common problems
  8. Wrap-up

Container overview

History and background

ISO BMFF generalized the QuickTime file format and was standardized as MPEG-4 Part 12 (ISO/IEC 14496-12). Industry profiles such as 3GPP and CMAF then constrain that same BMFF skeleton so that interoperability is predictable.

Technical characteristics

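  • Box (atom) hierarchy: Every box starts with a 4-byte big-endian size and a 4-byte type. A size of 1 means a 64-bit size follows the type, a size of 0 means the box runs to the end of the file, and the uuid type carries a 16-byte extended type.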
  • Timeline-centric: Video and audio are indexed per sample under trak → mdia → minf → stbl.
  • Codec-agnostic: stsd holds sample descriptions (H.264/HEVC/AV1, AAC, etc.); devices must still be able to decode the codec.
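The size-plus-type header described above can be walked with a few lines of Python. This is a minimal sketch, not a full parser: the function name iter_boxes and its tuple layout are illustrative choices, and it stops at the first box whose size does not fit instead of trying to recover.

```python
import io
import struct
from typing import BinaryIO, Iterator, Tuple


def iter_boxes(stream: BinaryIO, end: int) -> Iterator[Tuple[str, int, int, int]]:
    """Yield (type, box_start, payload_start, box_end) for sibling boxes.

    Walking starts at the stream's current position and stops at `end`.
    """
    pos = stream.tell()
    while pos + 8 <= end:
        stream.seek(pos)
        size, raw_type = struct.unpack(">I4s", stream.read(8))
        header = 8
        if size == 1:
            # A 64-bit "largesize" follows the 4-byte type.
            (size,) = struct.unpack(">Q", stream.read(8))
            header = 16
        elif size == 0:
            # The box extends to the end of the enclosing range.
            size = end - pos
        if size < header or pos + size > end:
            break  # truncated or corrupt box: stop instead of guessing
        yield raw_type.decode("latin-1"), pos, pos + header, pos + size
        pos += size
```

To descend into a container box such as moov, seek the stream to that box's payload_start and call iter_boxes again with its box_end as the new end.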

File extensions

Extension   Typical use
.mp4        General multimedia (video+audio or video-only)
.m4v        Video-centric (sometimes follows Apple DRM conventions; often interchangeable with .mp4)
.m4a        Audio-only BMFF (AAC, ALAC, etc.)

Prefer ffprobe over trusting extensions alone.


Internal structure

Key boxes

  • ftyp: File type. The major brand and compatible brands declare what the file claims to be.
  • moov: Movie box (metadata, timeline, track map). May include uuid extensions.
  • mdat: Media data—compressed samples (video NAL units, audio frames, …).
  • moof / mdat (repeated): Fragmented MP4 alternates segment metadata (moof) and media (mdat).
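The brands that ftyp declares can be read from the first bytes of a file. The sketch below assumes the file starts with an ftyp box that uses the ordinary 32-bit size; parse_ftyp is an illustrative name, not part of any library.

```python
import struct
from typing import List, Tuple


def parse_ftyp(data: bytes) -> Tuple[str, int, List[str]]:
    """Return (major_brand, minor_version, compatible_brands).

    `data` holds the first bytes of the file. Assumption: ftyp is the
    first box and uses a 32-bit size field.
    """
    if len(data) < 16:
        raise ValueError("too short to hold an ftyp box")
    size, box_type = struct.unpack(">I4s", data[:8])
    if box_type != b"ftyp":
        raise ValueError("first box is not ftyp")
    if size < 16 or size > len(data):
        raise ValueError("ftyp size is invalid or the data is truncated")
    major = data[8:12].decode("latin-1")
    (minor,) = struct.unpack(">I", data[12:16])
    # Compatible brands are packed as consecutive 4-byte codes.
    compat = [data[i:i + 4].decode("latin-1") for i in range(16, size - 3, 4)]
    return major, minor, compat
```

Reading the first few hundred bytes of a file is normally enough input, since ftyp is small.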

If moov follows mdat, the player may need to read or scan the end before playback—slow start on the web. Faststart moves moov earlier.
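Whether a file needs faststart can be checked by comparing the order of the top-level boxes. The following is a sketch under the assumption that the file is well-formed at the top level; top_level_types and is_faststart are hypothetical helper names.

```python
import io
import struct
from typing import BinaryIO, List, Optional


def top_level_types(stream: BinaryIO) -> List[str]:
    """List top-level box types in file order, reading only box headers."""
    end = stream.seek(0, io.SEEK_END)
    pos, types = 0, []
    while pos + 8 <= end:
        stream.seek(pos)
        size, raw_type = struct.unpack(">I4s", stream.read(8))
        header = 8
        if size == 1:  # 64-bit size follows the type
            extended = stream.read(8)
            if len(extended) < 8:
                break
            (size,) = struct.unpack(">Q", extended)
            header = 16
        elif size == 0:  # box runs to the end of the file
            size = end - pos
        if size < header:
            break
        types.append(raw_type.decode("latin-1"))
        pos += size
    return types


def is_faststart(stream: BinaryIO) -> Optional[bool]:
    """True if moov precedes mdat, False if it follows, None if either is missing."""
    types = top_level_types(stream)
    if "moov" not in types or "mdat" not in types:
        return None
    return types.index("moov") < types.index("mdat")
```

With a real file this would be called as is_faststart(open(path, "rb")); only headers are read, so large mdat boxes are skipped by seeking.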

Metadata

  • Track metadata: Language, timescale, sample tables live under moov.
  • User tags: ilst-style atoms (©nam, ©ART, …); some tools embed XMP in uuid boxes.
  • Subtitles: Often a separate track or sidecar—compatibility varies.
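As an illustration of how the timescale under moov is used, the sketch below reads the movie header (mvhd) and converts its duration from timescale units to seconds. It works on bytes already in memory for brevity; find_box and movie_duration_seconds are illustrative names. For fragmented files the mvhd duration may be 0, so treat the result as a hint in that case.

```python
import struct
from typing import Optional, Tuple


def find_box(data: bytes, box_type: bytes, start: int = 0,
             end: Optional[int] = None) -> Optional[Tuple[int, int]]:
    """Return (payload_start, box_end) of the first matching box in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, raw_type = struct.unpack(">I4s", data[pos:pos + 8])
        header = 8
        if size == 1:  # 64-bit size follows the type
            if pos + 16 > end:
                return None
            (size,) = struct.unpack(">Q", data[pos + 8:pos + 16])
            header = 16
        elif size == 0:  # box runs to the end of the range
            size = end - pos
        if size < header or pos + size > end:
            return None
        if raw_type == box_type:
            return pos + header, pos + size
        pos += size
    return None


def movie_duration_seconds(data: bytes) -> Optional[float]:
    """Duration from moov/mvhd: the duration field divided by the timescale."""
    moov = find_box(data, b"moov")
    if moov is None:
        return None
    mvhd = find_box(data, b"mvhd", moov[0], moov[1])
    if mvhd is None or mvhd[1] - mvhd[0] < 20:
        return None
    p = mvhd[0]
    if data[p] == 1:  # version 1: 64-bit times and duration
        if mvhd[1] - p < 32:
            return None
        timescale, duration = struct.unpack(">IQ", data[p + 20:p + 32])
    else:             # version 0: 32-bit times and duration
        timescale, duration = struct.unpack(">II", data[p + 12:p + 20])
    return duration / timescale if timescale else None
```

A timescale of 1000 with a duration of 5000 therefore means 5 seconds; each track has its own timescale in mdhd, which this sketch does not read.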

Classic MP4 vs fMP4

flowchart TB
  subgraph classic [Classic MP4]
    ftyp1[ftyp]
    moov1[moov: tracks & timeline]
    mdat1[mdat: compressed samples]
    ftyp1 --> moov1 --> mdat1
  end
  subgraph frag [Fragmented MP4]
    ftyp2[ftyp / styp]
    moov2[moov: initialization]
    moofA[moof: segment A]
    mdatA[mdat A]
    moofB[moof: segment B]
    mdatB[mdat B]
    ftyp2 --> moov2 --> moofA --> mdatA --> moofB --> mdatB
  end

Hands-on usage

Assume FFmpeg is installed (ffmpeg -version).

Inspect structure

ffprobe -hide_banner -show_format -show_streams -show_entries format_tags input.mp4

For raw box trees, tools like mp4dump (Bento4) or AtomicParsley speed debugging.

Remux without re-encoding

ffmpeg -i input.mkv -c copy -movflags +faststart output.mp4

-c copy only works when MP4 can carry the source codecs (for example, MKV with H.264 + AAC usually remuxes cleanly).

Encode while packaging for the web

ffmpeg -i input.mov -c:v libx264 -crf 20 -preset medium -pix_fmt yuv420p \
  -c:a aac -b:a 192k -movflags +faststart output.mp4

Fragmented MP4 (ABR source)

ffmpeg -i input.mp4 -c copy -f mp4 -movflags frag_keyframe+empty_moov+default_base_moof fragmented.mp4

Real HLS/DASH packagers also align segment length, keyframes, and master playlists—this is a minimal illustration.
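To sanity-check the result, it can help to look at the order of top-level box types (for example as listed by mp4dump) and count the moof boxes. The sketch below takes that list as plain strings; Layout and classify_layout are hypothetical names, and the classification is a rough heuristic, not a conformance check.

```python
from typing import List, NamedTuple


class Layout(NamedTuple):
    kind: str        # "fragmented", "classic", or "unknown"
    fragments: int   # number of top-level moof boxes


def classify_layout(box_types: List[str]) -> Layout:
    """Classify a file from its top-level box types, given in file order."""
    fragments = box_types.count("moof")
    if fragments:
        kind = "fragmented"
    elif "moov" in box_types and "mdat" in box_types:
        kind = "classic"
    else:
        kind = "unknown"
    return Layout(kind, fragments)
```

A file produced with frag_keyframe should report one fragment per keyframe-delimited chunk, so a count of 0 suggests the flags were not applied.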

Add metadata

ffmpeg -i input.mp4 -c copy -metadata title="Demo video" -metadata artist="pkglog" tagged.mp4

Rich tagging may be easier with AtomicParsley-class tools.

Faststart on an existing file

ffmpeg -i input.mp4 -c copy -movflags +faststart output.mp4

Moves moov without re-encoding video/audio.

Multiple tracks (audio + subtitles)

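ffmpeg -i main.mp4 -i alt_audio.m4a -i subs.srt \
  -map 0:v:0 -map 0:a:0 -map 1:a:0 -map 2:s:0 \
  -c:v copy -c:a copy -c:s mov_text \
  -metadata:s:a:1 language=eng -metadata:s:s:0 language=kor multi.mp4

SRT cannot be stream-copied into MP4, so the subtitle track is converted to mov_text (MP4 timed text) while video and audio are copied. Player support for mov_text varies; verify on target players.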


Performance comparison

Overhead vs other containers

Index and header overhead is usually under about 1% of file size. With the same codec and duration, file size is similar across containers; the differences come from index style, fragmentation, and extra tracks.
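The overhead figure can be estimated as everything that is not mdat payload, divided by the file size. The sketch below assumes you already have the top-level boxes as (type, total_size, header_size) tuples; container_overhead is an illustrative name and the numbers in the usage note are made up for the example.

```python
from typing import Iterable, Tuple


def container_overhead(boxes: Iterable[Tuple[str, int, int]]) -> float:
    """Fraction of the file that is not media payload.

    Each entry is (box_type, total_size_in_bytes, header_size_in_bytes).
    Only the payload of mdat boxes counts as media; everything else,
    including moov, moof, and box headers, counts as overhead.
    """
    entries = list(boxes)
    total = sum(size for _, size, _ in entries)
    if total <= 0:
        raise ValueError("no box data to measure")
    media = sum(size - header for name, size, header in entries if name == "mdat")
    return (total - media) / total
```

For a hypothetical file with a 32-byte ftyp, a 50,000-byte moov, and an mdat holding 10,000,000 bytes of samples, this gives roughly 0.5% overhead.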

Streaming fit

Aspect             MP4
Progressive HTTP   Excellent with faststart
HLS/DASH           fMP4 + CMAF is the mainstream choice
Low-latency live   Segmented fMP4 beats one huge MP4

Compatibility

  • Mobile: Native players expect MP4.
  • Web: MSE commonly uses fMP4.
  • Editing: NLEs open MP4 well, but long GOP and VFR can still confuse timelines—agree on intermediate codecs in pipelines.

Real-world use cases

Streaming (HLS, DASH)

  • HLS segments may be TS or fMP4—modern stacks favor fMP4 + CMAF.
  • DASH typically uses fMP4 under Period/AdaptationSet—understand init.mp4 vs media segments for incident analysis.

Web browsers

  • Single-file delivery: H.264 + AAC + yuv420p remains lowest friction (codec licensing separate).
  • MSE players often append fMP4 fragments.

Mobile apps

  • AVPlayer / ExoPlayer treat MP4 as first-class—good for offline cache and VOD downloads.

Archiving

  • The container does not replace encode settings for quality or size. What MP4 offers is broad tool support, a stable index, and metadata. Many teams keep edit-friendly masters (MOV/ProRes, etc.) and ship MP4 for delivery.

Optimization tips

Streaming (moov placement)

  • Web + CDN: default faststart (moov first).
  • Live segments: moof carries per-segment metadata—align packager settings with GOP length.
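Aligning segments with GOP length comes down to simple arithmetic: the segment duration should be a whole number of GOP durations, otherwise segments cannot all start on a keyframe. A minimal sketch, with gops_per_segment as a hypothetical helper and exact fractions used to avoid float rounding at rates like 30000/1001:

```python
from fractions import Fraction
from typing import Optional, Union

Number = Union[int, str, Fraction]


def gops_per_segment(segment_seconds: Number, gop_frames: int, fps: Number) -> Optional[int]:
    """Return how many whole GOPs fit in one segment, or None if they do not align.

    Pass non-integer values as strings (for example "30000/1001" or "6.006")
    so they are converted exactly.
    """
    gop_duration = Fraction(gop_frames) / Fraction(fps)
    ratio = Fraction(segment_seconds) / gop_duration
    if ratio <= 0 or ratio.denominator != 1:
        return None
    return int(ratio)
```

For example, a 60-frame GOP at 30 fps lasts 2 seconds, so a 6-second segment holds exactly 3 GOPs, while a 48-frame GOP (1.6 seconds) does not divide 6 seconds evenly.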

Smaller files

  • The container rarely saves dramatic space—tune video CRF/bitrate, audio bitrate, and drop unused tracks:
ffmpeg -i input.mp4 -map 0:v:0 -map 0:a:0 -c copy slim.mp4

Compatibility

  • Constrain the H.264 profile and level to what your oldest target devices can decode.
  • yuv420p is the usual lowest common denominator for web and mobile.

Common problems

Won’t play in browser

  • Unsupported codec: MP4 with HEVC or AC3 may fail—test H.264 + AAC.
  • moov at end: try faststart.

Metadata lost

  • -c copy remuxes can still drop tags depending on tool—keep critical metadata in sidecar JSON if needed.
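One way to keep a sidecar is to capture the container tags before remuxing. The sketch below assumes the JSON comes from running ffprobe -v error -print_format json -show_format on the source file; extract_tags, write_sidecar, and the .tags.json naming are illustrative choices.

```python
import json
from pathlib import Path
from typing import Dict


def extract_tags(ffprobe_json: str) -> Dict[str, str]:
    """Pull the container-level tags out of ffprobe's JSON output."""
    doc = json.loads(ffprobe_json)
    return dict(doc.get("format", {}).get("tags", {}))


def write_sidecar(media_path: str, tags: Dict[str, str]) -> Path:
    """Write tags next to the media file as <name>.tags.json and return the path."""
    sidecar = Path(str(media_path) + ".tags.json")
    sidecar.write_text(
        json.dumps(tags, ensure_ascii=False, indent=2, sort_keys=True),
        encoding="utf-8",
    )
    return sidecar
```

After a remux, the same extract_tags call on the output can be compared with the sidecar to see which tags were dropped.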

Codec issues

  • VFR can confuse editors—re-encode to CFR when required.
  • Heavy B-frames and reference frames may break old hardware—constrain profile.

Wrap-up

Summary

  • MP4 on ISO BMFF is the default delivery container—master moov/mdat and fMP4 for streaming.
  • FFmpeg remux, faststart, and tags fix many “encode looks fine but won’t play” issues.
  • Codec and bitrate decide quality/size; MP4 is the packaging that reaches most players.

When to choose MP4

  • Cross-platform VOD, mobile downloads, single-file web, and fMP4 inputs for HLS/DASH. For many subtitle and audio tracks, compare the MKV guide; for royalty-friendly web codecs, see the WebM guide.