MP4 Container: Complete Guide | ISO BMFF, moov, mdat, fMP4 & FFmpeg
Key takeaways
MP4 (ISO BMFF) is the closest thing to a universal container—box layout, streaming optimization, and fMP4 in one flow.
Introduction
MP4 is a familiar extension; the precise mental model is ISO Base Media File Format (ISO BMFF, ISO/IEC 14496-12) plus MPEG-4 system conventions (e.g. 14496-14). It is the default container for H.264/H.265/AV1 video with AAC audio, and it is what mobile, OTT, web, and editors expect.
In streaming ops, where moov lives, whether the file is fMP4, and whether compatible brands (ftyp) match players drive playback failures, buffering, and time-to-first-frame. This article gives the terms and FFmpeg fixes you need without manually parsing every bit.
After reading this post
- Explain the ISO BMFF box tree (ftyp, moov, mdat)
- Distinguish progressive playback (faststart) from fragmented MP4 (HLS/DASH)
- Remux, retag, and map tracks with FFmpeg patterns you can copy
- Compare overhead and compatibility vs other containers
Table of contents
- Container overview
- Internal structure
- Hands-on usage
- Performance comparison
- Real-world use cases
- Optimization tips
- Common problems
- Wrap-up
Container overview
History and background
ISO BMFF generalized the QuickTime file format and was standardized as MPEG-4 Part 12. Industry profiles such as 3GPP and CMAF then constrained the same BMFF skeleton to make interoperability more predictable.
Technical characteristics
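- Box (atom) hierarchy: Every box starts with a 4-byte big-endian size and a 4-byte type. A size of 1 means a 64-bit size follows the type, a size of 0 means the box runs to the end of the file, and uuid boxes carry a 16-byte extended type.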
- Timeline-centric: Video and audio are indexed per sample under trak → mdia → minf → stbl.
- Codec-agnostic: stsd holds the sample descriptions (H.264/HEVC/AV1, AAC, etc.); devices must still be able to decode the codec.
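The size + type header described above is simple enough to walk by hand. The following is a minimal sketch, assuming the whole buffer is already in memory; the function name iter_boxes and the tuple layout are illustrative choices, not part of any library.

```python
import struct
from typing import Iterator, Optional, Tuple


def iter_boxes(
    buf: bytes, start: int = 0, end: Optional[int] = None
) -> Iterator[Tuple[str, int, int, int]]:
    """Yield (type, offset, header_size, total_size) for sibling boxes in buf[start:end]."""
    end = len(buf) if end is None else end
    pos = start
    while pos + 8 <= end:
        # 4-byte big-endian size followed by a 4-byte type code.
        size, raw_type = struct.unpack_from(">I4s", buf, pos)
        header = 8
        if size == 1:
            # size == 1: a 64-bit "largesize" follows the type field.
            if pos + 16 > end:
                break
            (size,) = struct.unpack_from(">Q", buf, pos + 8)
            header = 16
        elif size == 0:
            # size == 0: the box extends to the end of the enclosing data.
            size = end - pos
        if size < header or pos + size > end:
            break  # truncated or corrupt box: stop rather than guess
        yield raw_type.decode("latin-1"), pos, header, size
        pos += size
```

To descend into a container box such as moov, the same function can be called again with start set to offset + header_size and end set to offset + total_size.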
File extensions
| Extension | Typical use |
|---|---|
| .mp4 | General multimedia (video+audio or video-only) |
| .m4v | Video-centric (sometimes Apple DRM conventions—often interchangeable with mp4) |
| .m4a | AAC-only (or ALAC, etc.) BMFF audio |
Prefer ffprobe over trusting extensions alone.
Internal structure
Key boxes
- ftyp: File type—major brand and compatible brands declare what the file claims to be.
- moov: Movie—metadata, timeline, track map. May include uuid extensions.
- mdat: Media data—compressed samples (video NAL units, audio frames, …).
- moof / mdat (repeated): Fragmented MP4 alternates segment metadata (moof) and media (mdat).
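For the ftyp box in particular, the body is a major brand, a 32-bit minor version, and then a run of 4-byte compatible brands. A small sketch of reading it, assuming you already have the box body (the bytes after the 8-byte header); parse_ftyp is a hypothetical helper name.

```python
import struct
from typing import List, Tuple


def parse_ftyp(payload: bytes) -> Tuple[str, int, List[str]]:
    """Return (major_brand, minor_version, compatible_brands) from an ftyp box body."""
    if len(payload) < 8:
        raise ValueError("ftyp body must be at least 8 bytes")
    major, minor = struct.unpack_from(">4sI", payload, 0)
    # Everything after the first 8 bytes is a list of 4-byte brand codes.
    brands = [
        payload[i:i + 4].decode("latin-1")
        for i in range(8, len(payload) - 3, 4)
    ]
    return major.decode("latin-1"), minor, brands
```

Comparing the returned brand list with what a target player advertises is usually quicker than guessing from the file extension.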
If moov follows mdat, the player may need to read or scan the end before playback—slow start on the web. Faststart moves moov earlier.
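Whether a file needs faststart can be checked by listing the top-level boxes in order and seeing which of moov and mdat comes first. The sketch below seeks past box bodies instead of reading them, so it stays cheap on large files; the function names are illustrative, and it assumes a seekable binary file object.

```python
import io
import struct
from typing import BinaryIO, List, Sequence


def top_level_types(f: BinaryIO) -> List[str]:
    """List top-level box types in file order without reading box bodies."""
    f.seek(0, io.SEEK_END)
    file_size = f.tell()
    types: List[str] = []
    pos = 0
    while pos + 8 <= file_size:
        f.seek(pos)
        size, raw_type = struct.unpack(">I4s", f.read(8))
        if size == 1:
            ext = f.read(8)  # 64-bit size stored after the type
            if len(ext) < 8:
                break
            (size,) = struct.unpack(">Q", ext)
            if size < 16:
                break
        elif size == 0:
            size = file_size - pos  # box runs to end of file
        elif size < 8:
            break  # corrupt header
        types.append(raw_type.decode("latin-1"))
        pos += size
    return types


def moov_before_mdat(types: Sequence[str]) -> bool:
    """True when moov precedes the first mdat (the faststart layout)."""
    if "moov" not in types:
        return False
    if "mdat" not in types:
        return True
    return list(types).index("moov") < list(types).index("mdat")
```

A typical use would be opening the file with open(path, "rb") and passing the handle to top_level_types; a False result from moov_before_mdat suggests the file is a candidate for a faststart remux.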
Metadata
- Track metadata: Language, timescale, sample tables live under moov.
- User tags: ilst-style atoms (©nam, ©ART, …); some tools embed XMP in uuid boxes.
- Subtitles: Often a separate track or sidecar—compatibility varies.
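Timescale is the piece of track metadata that most often causes confusion: durations are stored as integer ticks, and seconds only appear after dividing by the timescale. As a sketch, the movie header box (mvhd, inside moov) can be read like this, assuming you pass the mvhd body without its 8-byte box header; the helper names are hypothetical.

```python
import struct
from typing import Tuple


def parse_mvhd(payload: bytes) -> Tuple[int, int]:
    """Return (timescale, duration_in_ticks) from an mvhd box body."""
    version = payload[0]  # first byte of the version/flags word
    if version == 0:
        # version/flags (4) + creation (4) + modification (4), then 32-bit fields
        timescale, duration = struct.unpack_from(">II", payload, 12)
    elif version == 1:
        # version/flags (4) + creation (8) + modification (8), then 32/64-bit fields
        timescale, duration = struct.unpack_from(">IQ", payload, 20)
    else:
        raise ValueError(f"unsupported mvhd version {version}")
    return timescale, duration


def ticks_to_seconds(timescale: int, ticks: int) -> float:
    """Convert a tick count to seconds using the box's timescale."""
    return ticks / timescale
```

Each track also has its own timescale in mdhd, so movie-level and track-level durations should not be compared without converting both to seconds first.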
Classic MP4 vs fMP4
flowchart TB
subgraph classic [Classic MP4]
ftyp1[ftyp]
moov1[moov: tracks & timeline]
mdat1[mdat: compressed samples]
ftyp1 --> moov1 --> mdat1
end
subgraph frag [Fragmented MP4]
ftyp2[ftyp / styp]
moov2[moov: initialization]
moofA[moof: segment A]
mdatA[mdat A]
moofB[moof: segment B]
mdatB[mdat B]
ftyp2 --> moov2 --> moofA --> mdatA --> moofB --> mdatB
end
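The two layouts in the diagram can be told apart from the top-level box order alone. A minimal sketch, assuming you already have the list of top-level box types from any box lister (for example mp4dump output); both function names are illustrative.

```python
from typing import Sequence


def is_fragmented(top_level: Sequence[str]) -> bool:
    """A top-level moof box is the usual sign of fragmented MP4."""
    return "moof" in top_level


def count_fragments(top_level: Sequence[str]) -> int:
    """Count moof boxes that are immediately followed by an mdat."""
    return sum(
        1
        for current, following in zip(top_level, top_level[1:])
        if current == "moof" and following == "mdat"
    )
```

For a classic file such as ftyp, moov, mdat the count is zero; for the fragmented layout in the diagram it is two.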
Hands-on usage
Assume FFmpeg is installed (ffmpeg -version).
Inspect structure
ffprobe -hide_banner -show_format -show_streams -show_entries format_tags input.mp4
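When the same inspection has to run in a script, ffprobe's JSON output is easier to consume than its default text. The following is a sketch only: run_ffprobe assumes ffprobe is on PATH, and summarize is a hypothetical helper that keeps just a few fields.

```python
import json
import subprocess
from typing import Any, Dict, List


def run_ffprobe(path: str) -> Dict[str, Any]:
    """Run ffprobe and return its JSON report (assumes ffprobe is on PATH)."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-of", "json",
         "-show_format", "-show_streams", path],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def summarize(probe: Dict[str, Any]) -> List[str]:
    """Reduce an ffprobe JSON report to one line per container and stream."""
    fmt = probe.get("format", {})
    lines = [
        f"container={fmt.get('format_name', '?')} "
        f"duration={fmt.get('duration', '?')}"
    ]
    for stream in probe.get("streams", []):
        lines.append(
            f"#{stream.get('index')} {stream.get('codec_type')} "
            f"{stream.get('codec_name')}"
        )
    return lines
```

Calling summarize(run_ffprobe("input.mp4")) gives a compact listing that is convenient to log next to a playback incident.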
For raw box trees, tools like mp4dump (Bento4) or AtomicParsley speed debugging.
Remux without re-encoding
ffmpeg -i input.mkv -c copy -movflags +faststart output.mp4
-c copy requires compatible codecs (e.g. MKV with H.264+AAC usually remuxes cleanly).
Encode while packaging for the web
ffmpeg -i input.mov -c:v libx264 -crf 20 -preset medium -pix_fmt yuv420p \
-c:a aac -b:a 192k -movflags +faststart output.mp4
Fragmented MP4 (ABR source)
ffmpeg -i input.mp4 -c copy -f mp4 -movflags frag_keyframe+empty_moov+default_base_moof fragmented.mp4
Real HLS/DASH packagers also align segment length, keyframes, and master playlists—this is a minimal illustration.
Add metadata
ffmpeg -i input.mp4 -c copy -metadata title="Demo video" -metadata artist="pkglog" tagged.mp4
Rich tagging may be easier with AtomicParsley-class tools.
Faststart on an existing file
ffmpeg -i input.mp4 -c copy -movflags +faststart output.mp4
Moves moov without re-encoding video/audio.
Multiple tracks (audio + subtitles)
ffmpeg -i main.mp4 -i alt_audio.m4a -i subs.srt \
-map 0:v:0 -map 0:a:0 -map 1:a:0 -map 2:s:0 \
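-c:v copy -c:a copy -c:s mov_text \
-metadata:s:a:1 language=eng -metadata:s:s:0 language=kor multi.mp4
MP4 cannot carry SRT as-is, so the subtitle stream is converted to mov_text while video and audio are still copied. Player support for mov_text varies, so verify on target players.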
Performance comparison
Overhead vs other containers
Index and header overhead is usually well under ~1% of file size; same codec and duration means similar size—differences come from index style, fragmentation, and extra tracks.
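The overhead figure can be checked for a specific file with simple arithmetic: everything that is not mdat payload counts as container overhead. A sketch, assuming you have (type, header_size, total_size) for each top-level box; the function name and the sample numbers are made up for illustration.

```python
from typing import Iterable, Tuple


def container_overhead(boxes: Iterable[Tuple[str, int, int]]) -> float:
    """Fraction of the file that is not mdat payload.

    Each entry is (box_type, header_size, total_size) for a top-level box.
    """
    boxes = list(boxes)
    total = sum(size for _, _, size in boxes)
    payload = sum(size - header for kind, header, size in boxes if kind == "mdat")
    return (total - payload) / total if total else 0.0


# Illustrative numbers only: a ~10 MB file with a 50 KB moov.
example = [("ftyp", 8, 32), ("moov", 8, 50_000), ("mdat", 8, 10_000_008)]
overhead = container_overhead(example)  # roughly half a percent
```

Fragmented files spread this overhead across many moof boxes, so the ratio tends to rise as fragments get shorter.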
Streaming fit
| Aspect | MP4 |
|---|---|
| Progressive HTTP | Excellent with faststart |
| HLS/DASH | fMP4 + CMAF is the mainstream axis |
| Low-latency live | Segmented fMP4 beats one huge MP4 |
Compatibility
- Mobile: Native players expect MP4.
- Web: MSE commonly uses fMP4.
- Editing: NLEs open MP4 well, but long GOP and VFR can still confuse timelines—agree on intermediate codecs in pipelines.
Real-world use cases
Streaming (HLS, DASH)
- HLS segments may be TS or fMP4—modern stacks favor fMP4 + CMAF.
- DASH typically uses fMP4 under Period/AdaptationSet—understand init.mp4 vs media segments for incident analysis.
Web browsers
- Single-file delivery: H.264 + AAC + yuv420p remains lowest friction (codec licensing separate).
- MSE players often append fMP4 fragments.
Mobile apps
- AVPlayer / ExoPlayer treat MP4 as first-class—good for offline cache and VOD downloads.
Archiving
- Container does not replace encode settings for quality/size. MP4 offers broad tool support for index stability and metadata. Many teams keep edit-friendly masters (MOV/ProRes, etc.) and ship MP4 for delivery.
Optimization tips
Streaming (moov placement)
- Web + CDN: default faststart (moov first).
- Live segments: moof carries per-segment metadata—align packager settings with GOP length.
Smaller files
- The container rarely saves dramatic space—tune video CRF/bitrate, audio bitrate, and drop unused tracks:
ffmpeg -i input.mp4 -map 0:v:0 -map 0:a:0 -c copy slim.mp4
Compatibility
- Constrain the H.264 profile and level to what your oldest target devices can decode.
- yuv420p is the usual lowest common denominator for web and mobile.
Common problems
Won’t play in browser
- Unsupported codec: MP4 with HEVC or AC3 may fail—test H.264 + AAC.
- moov at end: try faststart.
Metadata lost
- -c copy remuxes can still drop tags depending on the tool, so keep critical metadata in sidecar JSON if needed.
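A sidecar can be as simple as a JSON file next to the media. The sketch below assumes a naming convention of name.tags.json beside name.mp4; that convention and both helper names are assumptions for illustration, not an established standard.

```python
import json
from pathlib import Path
from typing import Dict


def sidecar_path(media_path: str) -> Path:
    """demo.mp4 -> demo.tags.json (naming convention assumed here)."""
    return Path(media_path).with_suffix(".tags.json")


def write_sidecar(media_path: str, tags: Dict[str, str]) -> Path:
    """Store tags next to the media file so a remux cannot lose them."""
    target = sidecar_path(media_path)
    target.write_text(
        json.dumps(tags, ensure_ascii=False, indent=2, sort_keys=True),
        encoding="utf-8",
    )
    return target


def read_sidecar(media_path: str) -> Dict[str, str]:
    """Load tags previously saved with write_sidecar."""
    return json.loads(sidecar_path(media_path).read_text(encoding="utf-8"))
```

After a remux, the saved tags can be compared with what ffprobe reports and reapplied with -metadata where they differ.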
Codec issues
- VFR can confuse editors—re-encode to CFR when required.
- Heavy B-frames and reference frames may break old hardware—constrain profile.
Wrap-up
Summary
- MP4 on ISO BMFF is the default delivery container—master moov/mdat and fMP4 for streaming.
- FFmpeg remux, faststart, and tags fix many “encode looks fine but won’t play” issues.
- Codec and bitrate decide quality/size; MP4 is the packaging that reaches most players.
When to choose MP4
- Cross-platform VOD, mobile downloads, single-file web, and fMP4 inputs for HLS/DASH. For many subtitle and audio tracks, compare the MKV guide; for royalty-friendly web codecs, see the WebM guide.