MP4 Container Format Complete Guide | ISO BMFF, moov, mdat, fMP4, FFmpeg Practical

Key Takeaways

MP4 (ISO BMFF) is the closest thing to a universal, globally standardized container. This guide covers its box structure, streaming optimization, and fMP4 in one pass.

Introduction

MP4 is familiar as a file extension, but technically it is the ISO Base Media File Format (ISO BMFF, ISO/IEC 14496-12) with the MP4 file format conventions of MPEG-4 (ISO/IEC 14496-14 and related parts) layered on top. It is the most common container for bundling H.264/H.265/AV1 video and AAC audio in one file, and the default format expected by mobile devices, OTT services, the web, and editing tools.

In streaming practice, three things directly affect playback failures, buffering, and first-frame delay: whether moov comes before mdat, whether the file is fragmented (fMP4), and whether the brands declared in ftyp match what the player expects. This guide collects the terminology and FFmpeg-reproducible steps needed for operations, encoding, and distribution discussions, without going down to the bitstream level.

After Reading This

  • Explain the ISO BMFF box tree (ftyp, moov, mdat) with a diagram
  • Distinguish progressive playback (faststart) from fragmented MP4 (HLS/DASH) and know when each fits
  • Copy and adapt FFmpeg patterns for remuxing, metadata, and track mapping
  • Justify a format choice by comparing compatibility and overhead with other containers

Table of Contents

  1. Container Overview
  2. Internal Structure
  3. Practical Usage
  4. Performance Comparison
  5. Real-World Use Cases
  6. Optimization Tips
  7. Common Issues and Solutions
  8. Conclusion

Container Overview

History and Development Background

ISO BMFF was standardized as MPEG-4 Part 12 (ISO/IEC 14496-12) by generalizing the QuickTime file format. The MP4 file format itself is defined in ISO/IEC 14496-14 as a profile of that base, mainly specifying how audio and video tracks are carried. In industry, 3GPP profiles (mobile) and CMAF (Common Media Application Format) have tightened interoperability on the same BMFF skeleton.

Technical Features

  • Box (atom) based: Every box starts with a four-byte size and a four-byte type, and boxes nest hierarchically. The exceptions are a 64-bit largesize field (used when size is 1) and uuid extended types.
  • Timeline-centric: Video and audio are indexed per sample by the sample tables under trak → mdia → minf → stbl.
  • Codec independent: Codec configuration (the sample description) lives in stsd, so the same container can carry H.264/HEVC/AV1, AAC, and others. The device or browser must still support the codec for playback.
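The size-and-type header described above can be sketched in a few lines of Python. This is a minimal illustration and not a full parser: the names iter_boxes and make_box are hypothetical helpers, and the sample bytes are synthetic stand-ins for a real file. It shows how sibling boxes are walked and how a container box such as moov is descended into.

```python
import struct

def iter_boxes(data, start=0, end=None):
    """Yield (type, offset, header_size, total_size) for sibling boxes in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, raw_type = struct.unpack_from(">I4s", data, pos)
        header_size = 8
        if size == 1:
            # size == 1 means a 64-bit "largesize" follows the type field
            if pos + 16 > end:
                break
            size = struct.unpack_from(">Q", data, pos + 8)[0]
            header_size = 16
        elif size == 0:
            # size == 0 means the box runs to the end of the enclosing range
            size = end - pos
        if size < header_size or pos + size > end:
            break  # truncated or corrupt box: stop instead of guessing
        yield raw_type.decode("latin-1"), pos, header_size, size
        pos += size

def make_box(box_type, payload=b""):
    """Build a box with a plain 32-bit size (illustration helper, not a muxer)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

# Synthetic sample (assumption): ftyp, then moov holding one child box, then mdat
sample = (make_box(b"ftyp", b"isom" + bytes(4) + b"isommp41")
          + make_box(b"moov", make_box(b"mvhd", bytes(100)))
          + make_box(b"mdat", bytes(32)))

for box_type, offset, header_size, size in iter_boxes(sample):
    print(box_type, offset, size)
    if box_type == "moov":  # moov is a container, so its payload is more boxes
        for child in iter_boxes(sample, offset + header_size, offset + size):
            print("  ", child[0], child[1], child[3])
```

The same function handles both levels because a container box's payload is just another run of boxes. For real files, reading headers with seek instead of loading everything into memory is usually preferable.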

File Extensions

Extension    Common Use
.mp4         General multimedia (video+audio or video only)
.m4v         Video-focused; Apple conventions or DRM may apply, but in practice often interchangeable with .mp4
.m4a         Audio-only BMFF file, usually AAC (or ALAC, etc.)

In operations, checking the actual brands and track layout with ffprobe is safer than trusting the extension alone.
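As a rough illustration of what "checking the brand" means at the byte level, the sketch below decodes an ftyp box. It assumes ftyp is the first box in the data, which is typical but not guaranteed; parse_ftyp is a hypothetical helper and the sample bytes are synthetic.

```python
import struct

def parse_ftyp(data):
    """Return (major_brand, minor_version, compatible_brands) from a leading ftyp box."""
    if len(data) < 16:
        raise ValueError("too short to hold an ftyp box")
    size, box_type = struct.unpack_from(">I4s", data, 0)
    if box_type != b"ftyp":
        raise ValueError("first box is not ftyp")
    if size < 16 or size > len(data):
        raise ValueError("ftyp size is invalid or data is truncated")
    major = data[8:12].decode("latin-1")
    minor_version = struct.unpack_from(">I", data, 12)[0]
    # The rest of the box is a list of four-character compatible brands
    compatible = [data[i:i + 4].decode("latin-1") for i in range(16, size - 3, 4)]
    return major, minor_version, compatible

# Synthetic ftyp (assumption): major brand isom, four compatible brands
payload = b"isom" + struct.pack(">I", 512) + b"isomiso2avc1mp41"
sample = struct.pack(">I4s", 8 + len(payload), b"ftyp") + payload

major, minor_version, compatible = parse_ftyp(sample)
print(major, minor_version, compatible)
```

For a real file, passing the first few hundred bytes is normally enough, since ftyp is small. ffprobe reports the same values as the major_brand and compatible_brands format tags.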


Internal Structure

Core Atom/Box Structure

Boxes frequently encountered at the top level:

  • ftyp: File Type Box. Declares the major brand and the compatible brands the file follows. It is the player's clue for judging whether it can understand this variant.
  • moov: Movie Box. Contains the metadata, the timeline, and the full track map. uuid extended metadata is sometimes attached here.
  • mdat: Media Data Box. Stores the compressed sample data (video NAL units, audio frames, etc.) contiguously.
  • moof / mdat (repeated): In fragmented MP4, per-fragment metadata (moof) and media (mdat) are paired and repeated.

If moov comes after mdat, the player must read or seek to the end of the file before it knows the full timeline, which delays the start of web playback. Moving moov to the front is what is commonly called the faststart optimization.
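The moov-before-mdat condition can be checked by reading only the top-level box headers. The sketch below works on any seekable binary file object. The function names are hypothetical, and the two in-memory samples are synthetic stand-ins for real files.

```python
import io
import struct

def top_level_types(f):
    """Return top-level box types in file order, reading only box headers."""
    f.seek(0, io.SEEK_END)
    file_size = f.tell()
    types, pos = [], 0
    while pos + 8 <= file_size:
        f.seek(pos)
        size, raw_type = struct.unpack(">I4s", f.read(8))
        min_size = 8
        if size == 1:  # a 64-bit largesize follows the type
            ext = f.read(8)
            if len(ext) < 8:
                break
            size = struct.unpack(">Q", ext)[0]
            min_size = 16
        elif size == 0:  # box extends to the end of the file
            size = file_size - pos
        if size < min_size:
            break  # corrupt header: stop instead of guessing
        types.append(raw_type.decode("latin-1"))
        pos += size  # skip the payload without reading it
    return types

def moov_before_mdat(types):
    """True when moov appears ahead of the first mdat (the faststart layout)."""
    return ("moov" in types and "mdat" in types
            and types.index("moov") < types.index("mdat"))

def box(box_type, payload=b""):
    """Illustration helper: build a box with a 32-bit size."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

# Synthetic samples (assumption): same boxes, different order
fast = io.BytesIO(box(b"ftyp", bytes(16)) + box(b"moov", bytes(40)) + box(b"mdat", bytes(64)))
slow = io.BytesIO(box(b"ftyp", bytes(16)) + box(b"mdat", bytes(64)) + box(b"moov", bytes(40)))

print(top_level_types(fast), moov_before_mdat(top_level_types(fast)))
print(top_level_types(slow), moov_before_mdat(top_level_types(slow)))
```

With a real file, the same call would be made on a handle opened with open("input.mp4", "rb"). Because payloads are skipped with seek, the check stays cheap even for multi-gigabyte files.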

Metadata Storage Method

  • Track metadata: Each track's language, timescale, and sample tables live in moov.
  • Tags (user metadata): The ilst family, such as ©nam and ©ART (some tools store XMP in a uuid box).
  • Subtitles: Often a separate track (e.g., tx3g or a WebVTT variant), or delivered alongside as an external file.
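To make the track metadata item concrete, the sketch below decodes the timescale, duration, and language from the payload of an mdhd (media header) box, which sits under trak → mdia. It assumes the ISO BMFF layout with a packed ISO 639-2 language code. The helper names parse_mdhd and pack_language are hypothetical, and the sample payload is synthetic.

```python
import struct

def parse_mdhd(payload):
    """Decode (timescale, duration, language) from an mdhd payload.

    payload is everything after the 8-byte box header: a version byte,
    three flag bytes, then version-dependent fields.
    """
    version = payload[0]
    if version == 1:
        # 64-bit creation and modification times, 32-bit timescale, 64-bit duration
        timescale, duration = struct.unpack_from(">IQ", payload, 4 + 16)
        lang_offset = 4 + 16 + 12
    else:
        # 32-bit creation and modification times, 32-bit timescale and duration
        timescale, duration = struct.unpack_from(">II", payload, 4 + 8)
        lang_offset = 4 + 8 + 8
    packed = struct.unpack_from(">H", payload, lang_offset)[0]
    # Three 5-bit values, each stored as (ASCII code - 0x60)
    language = "".join(chr(((packed >> shift) & 0x1F) + 0x60) for shift in (10, 5, 0))
    return timescale, duration, language

def pack_language(code):
    """Illustration helper: pack a three-letter lowercase code into 15 bits."""
    a, b, c = (ord(ch) - 0x60 for ch in code)
    return (a << 10) | (b << 5) | c

# Synthetic version-0 payload (assumption): 48 kHz timescale, 96000 ticks, Korean
sample = bytes(4) + bytes(8) + struct.pack(">IIHH", 48000, 96000, pack_language("kor"), 0)

timescale, duration, language = parse_mdhd(sample)
print(language, duration / timescale, "seconds")
```

Duration is expressed in timescale ticks, so dividing the two gives seconds. This is the value a player uses to build the per-track timeline.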

Structure Diagram (Classic MP4 vs fMP4)

flowchart TB
  subgraph classic [Classic MP4]
    ftyp1[ftyp]
    moov1[moov: tracks & timeline]
    mdat1[mdat: compressed samples]
    ftyp1 --> moov1 --> mdat1
  end
  subgraph frag [Fragmented MP4]
    ftyp2[ftyp / styp]
    moov2[moov: initialization]
    moofA[moof: segment A]
    mdatA[mdat A]
    moofB[moof: segment B]
    mdatB[mdat B]
    ftyp2 --> moov2 --> moofA --> mdatA --> moofB --> mdatB
  end

Practical Usage

The examples below assume FFmpeg is installed (check with ffmpeg -version).

Check Structure

ffprobe -hide_banner -show_format -show_streams -show_entries format_tags input.mp4

To inspect the box tree itself, using mp4dump (Bento4) or AtomicParsley alongside ffprobe speeds up debugging.

Remux Without Codec Change (Container Only)

ffmpeg -i input.mkv -c copy -movflags +faststart output.mp4

-c copy moves the streams without re-encoding. The codecs must be ones MP4 can carry (e.g., H.264+AAC from MKV → MP4 is usually fine).

Web-Friendly Packaging with Encoding

ffmpeg -i input.mov -c:v libx264 -crf 20 -preset medium -pix_fmt yuv420p \
  -c:a aac -b:a 192k -movflags +faststart output.mp4

Fragmented MP4 (e.g., Adaptive Streaming Source)

ffmpeg -i input.mp4 -c copy -f mp4 -movflags frag_keyframe+empty_moov+default_base_moof fragmented.mp4

Real HLS/DASH packagers also control segment length, keyframe alignment, and the master playlist or manifest, so the command above is only a minimal example for seeing how fMP4 is laid out.
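One way to confirm that an output file really is fragmented is to look for moof boxes at the top level. The following sketch classifies a file by its top-level box order. The function names and the three result strings are illustrative choices, and the sample is built in memory instead of coming from FFmpeg.

```python
import struct

def top_level_types(data):
    """Return the top-level box types of a bytes object in file order."""
    types, pos = [], 0
    while pos + 8 <= len(data):
        size, raw_type = struct.unpack_from(">I4s", data, pos)
        min_size = 8
        if size == 1 and pos + 16 <= len(data):  # 64-bit largesize
            size = struct.unpack_from(">Q", data, pos + 8)[0]
            min_size = 16
        elif size == 0:  # box extends to the end of the data
            size = len(data) - pos
        if size < min_size:
            break  # corrupt or truncated header
        types.append(raw_type.decode("latin-1"))
        pos += size
    return types

def describe_layout(types):
    """Classify a file from its top-level box order."""
    fragments = types.count("moof")
    if fragments:
        return f"fragmented ({fragments} moof boxes)"
    if "moov" in types and "mdat" in types:
        if types.index("moov") < types.index("mdat"):
            return "progressive, moov first"
        return "progressive, moov after mdat"
    return "unknown"

def box(box_type, payload=b""):
    """Illustration helper: build a box with a 32-bit size."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

# Synthetic fragmented layout (assumption): init boxes, then two moof/mdat pairs
fragmented = (box(b"ftyp", bytes(16)) + box(b"moov", bytes(40))
              + box(b"moof", bytes(16)) + box(b"mdat", bytes(50))
              + box(b"moof", bytes(16)) + box(b"mdat", bytes(70)))

print(describe_layout(top_level_types(fragmented)))
```

A file written with frag_keyframe would be expected to show one moof per fragment, roughly one per keyframe interval.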

Add Metadata (Tags)

ffmpeg -i input.mp4 -c copy -metadata title="Demo Video" -metadata artist="pkglog" tagged.mp4

For richer tags, dedicated tools such as AtomicParsley are often more convenient.

Streaming Optimization: Apply faststart Post-Encoding

ffmpeg -i input.mp4 -c copy -movflags +faststart output.mp4

Useful when an already encoded file only needs its moov position corrected.

Multiple Tracks (Audio & Subtitles) Example

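ffmpeg -i main.mp4 -i alt_audio.m4a -i subs.srt \
  -map 0:v:0 -map 0:a:0 -map 1:a:0 -map 2:s:0 \
  -c:v copy -c:a copy -c:s mov_text \
  -metadata:s:a:1 language=eng -metadata:s:s:0 language=kor multi.mp4

SRT cannot be stream-copied into MP4, so the subtitle track is converted to mov_text (tx3g) here. Subtitle codec support varies by player, so validate on the target devices.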


Performance Comparison

Overhead vs Other Containers

Container overhead (indexes and headers) is typically less than 1% of the total file size, so the size difference between containers is negligible for the same codec and duration. Differences come mainly from the indexing method, segmentation, and extra tracks.
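The overhead figure can be estimated as the share of bytes that are not mdat payload. The sketch below does this on an in-memory bytes object. container_overhead is a hypothetical helper, and the sample sizes are made up for illustration and are not measurements of a real encode.

```python
import struct

def container_overhead(data):
    """Return the fraction of bytes that are not mdat payload.

    Everything outside mdat payloads (box headers, ftyp, moov, free space)
    is counted as container overhead.
    """
    if not data:
        raise ValueError("empty input")
    payload, pos = 0, 0
    while pos + 8 <= len(data):
        size, raw_type = struct.unpack_from(">I4s", data, pos)
        header_size = 8
        if size == 1 and pos + 16 <= len(data):  # 64-bit largesize
            size = struct.unpack_from(">Q", data, pos + 8)[0]
            header_size = 16
        elif size == 0:  # box extends to the end of the data
            size = len(data) - pos
        if size < header_size:
            break  # corrupt header
        if raw_type == b"mdat":
            # Clamp to the available bytes in case the file is truncated
            payload += min(size, len(data) - pos) - header_size
        pos += size
    return 1 - payload / len(data)

def box(box_type, payload=b""):
    """Illustration helper: build a box with a 32-bit size."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

# Synthetic file (assumption): small ftyp and moov, 1,000,000 bytes of media
sample = (box(b"ftyp", bytes(16)) + box(b"moov", bytes(108))
          + box(b"mdat", bytes(1_000_000)))

print(f"{container_overhead(sample):.4%}")
```

On real content the moov size grows with the number of samples, so long or high-frame-rate files carry a larger index, and fragmented files add a moof per fragment on top of that.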

Streaming Suitability

Aspect               MP4
Progressive HTTP     Well suited when faststart is applied
HLS/DASH             fMP4 with CMAF is the de facto standard
Live low latency     Segmented fMP4 works better than a single large MP4

Compatibility Range

  • Mobile: Native players on iOS and Android support MP4 by default.
  • Web: fMP4 is common with MSE (Media Source Extensions).
  • Editing: NLEs generally open MP4 natively, but long-GOP encoding and variable frame rate can cause timeline issues, so teams often agree on an intermediate codec and profile.

Real-World Use Cases

Streaming Services (HLS, DASH)

  • HLS: Segments can be TS or fMP4. Modern pipelines lean heavily on fMP4 with CMAF.
  • DASH: fMP4 segments under Period/AdaptationSet are typical. Understanding how the initialization segment (init.mp4) is separated from the media segments makes troubleshooting easier.
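The split between initialization and media segments can be illustrated by locating byte ranges in a single fragmented file. This is a simplified sketch and not how a packager works: it assumes everything before the first moof is initialization data and that each media segment is one moof followed directly by its mdat boxes, with no styp or sidx. The function names are hypothetical and the sample is synthetic.

```python
import struct

def top_level_boxes(data):
    """Return [(type, offset, size)] for the top-level boxes of a bytes object."""
    boxes, pos = [], 0
    while pos + 8 <= len(data):
        size, raw_type = struct.unpack_from(">I4s", data, pos)
        min_size = 8
        if size == 1 and pos + 16 <= len(data):  # 64-bit largesize
            size = struct.unpack_from(">Q", data, pos + 8)[0]
            min_size = 16
        elif size == 0:  # box extends to the end of the data
            size = len(data) - pos
        if size < min_size:
            break  # corrupt header
        boxes.append((raw_type.decode("latin-1"), pos, size))
        pos += size
    return boxes

def split_segments(data):
    """Return (init_range, media_ranges) as (offset, size) tuples."""
    init_end = None
    media = []
    for box_type, offset, size in top_level_boxes(data):
        if box_type == "moof":
            if init_end is None:
                init_end = offset  # everything before the first moof is init data
            media.append([offset, size])
        elif box_type == "mdat" and media and sum(media[-1]) == offset:
            media[-1][1] += size  # mdat directly after its moof joins the segment
    if init_end is None:
        raise ValueError("no moof box found: not a fragmented MP4")
    return (0, init_end), [tuple(m) for m in media]

def box(box_type, payload=b""):
    """Illustration helper: build a box with a 32-bit size."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

# Synthetic fragmented file (assumption) with a trailing mfra index box
sample = (box(b"ftyp", bytes(16)) + box(b"moov", bytes(108))
          + box(b"moof", bytes(16)) + box(b"mdat", bytes(50))
          + box(b"moof", bytes(16)) + box(b"mdat", bytes(70))
          + box(b"mfra", bytes(8)))

init_range, media_ranges = split_segments(sample)
print(init_range, media_ranges)
```

A player needs the initialization range once, then any media range can be appended after it. A media segment fed without its init data fails to decode for this reason.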

Web Browsers

  • For single-file distribution, H.264+AAC with yuv420p still has the lowest friction (codec licensing needs a separate review).
  • MSE-based players typically append fMP4 fragments one after another.

Mobile Apps

  • AVPlayer and ExoPlayer both treat MP4 as a first-class format. It suits offline caching and download VOD.

Archiving

  • Unless the encode is lossless, encoding settings dominate size and quality. From the container's perspective, MP4 offers stable indexing, extensible metadata, and wide tool support. Many teams keep masters in an edit-friendly codec and container (e.g., MOV with a production codec) and use MP4 for delivery and distribution.

Optimization Tips

Streaming Optimization (moov Position)

  • Web distribution and CDN caching: Make faststart (moov first) the default.
  • Live and segmented delivery: In fMP4, moof carries the per-fragment metadata, so align the packager's segment settings with the keyframe interval.

Minimize File Size

  • The container alone rarely yields a dramatic reduction. Video CRF/bitrate, audio bitrate, and removing unnecessary tracks come first.
# Keep only the first video and first audio track
ffmpeg -i input.mp4 -map 0:v:0 -map 0:a:0 -c copy slim.mp4

Improve Compatibility

  • Limit the profile and level to the target device specs, for example H.264 High Profile at Level 4.1.
  • The yuv420p pixel format is often fixed as the lowest common denominator for web and mobile.
# Ensure compatibility
ffmpeg -i input.mp4 -c:v libx264 -profile:v high -level 4.1 -pix_fmt yuv420p \
  -c:a aac -b:a 128k -movflags +faststart compatible.mp4

Common Issues and Solutions

Won’t Play in Browser

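  • Codec unsupported: Even inside MP4, codecs such as HEVC or AC-3 may be blocked by the browser or OS. Re-encode to H.264+AAC, or convert only the audio to AAC as a test.
  • moov is after mdat: Check first whether faststart resolves it.
# Check moov position (the trace log lists top-level boxes in file order)
ffprobe -v trace input.mp4 2>&1 | grep -e "type:'moov'" -e "type:'mdat'"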

# Fix with faststart
ffmpeg -i input.mp4 -c copy -movflags +faststart output.mp4

Metadata Disappears

  • Even when remuxing with -c copy, some tags may be lost depending on the tool. Consider a sidecar JSON file for important metadata.

Codec Compatibility Issues

  • Variable frame rate (VFR): Can cause sync drift in editors and players. Re-encode to CFR where possible.
  • Settings with many B-frames or reference frames can cause problems on old hardware, so profile restrictions may be needed.
# Convert VFR to CFR
ffmpeg -i input_vfr.mp4 -c:v libx264 -r 30 -c:a copy output_cfr.mp4

Conclusion

Key Summary

  • MP4 is the de facto standard distribution container built on ISO BMFF. In practice, the core is understanding the moov/mdat relationship and the transition to fMP4.
  • If you can reliably handle remuxing, faststart, and tags with FFmpeg, you can rule out many apparent encoding problems at the container level.
  • Final quality and size are determined by the codec and bitrate. MP4 is the wrapping paper that lets the content open everywhere.
  • Prioritize MP4 for cross-platform VOD, mobile downloads, single-file web distribution, and as fMP4 packaging input for HLS/DASH. When you need many multilingual subtitles or extra tracks, compare with the MKV Guide. For royalty-free web codec combinations, compare with the WebM Guide.

Quick Command Reference

# Check MP4 structure
ffprobe -hide_banner input.mp4

# Remux with faststart
ffmpeg -i input.mkv -c copy -movflags +faststart output.mp4

# Create fragmented MP4
ffmpeg -i input.mp4 -c copy -f mp4 -movflags frag_keyframe+empty_moov fragmented.mp4

# Add metadata
ffmpeg -i input.mp4 -c copy -metadata title="Title" -metadata artist="Artist" output.mp4

# Multiple audio tracks
ffmpeg -i video.mp4 -i audio_en.m4a -i audio_ko.m4a \
  -map 0:v -map 1:a -map 2:a -c copy \
  -metadata:s:a:0 language=eng -metadata:s:a:1 language=kor multi.mp4

  • Container Format Comparison: MP4 vs MKV vs WebM
  • MKV Practical Guide
  • WebM Web Standard Guide
  • H.264 Video Codec Complete Guide

Keywords

MP4, ISO BMFF, Container, moov, mdat, Fragmented MP4, fMP4, FFmpeg, Streaming, HLS, DASH, faststart