3. File Structure

File Structure (slides 22–23)

An audio-visual container or wrapper describes how the different audio and video streams coexist in one file. The container may also contain other elements, for example subtitles.

A codec encodes a data stream or signal for transmission or storage, and decodes it for playback or editing. Audio-visual content needs both an audio codec and a video codec.

Audio-Visual Container (slide 24)

MP4, MOV and AVI can all hold uncompressed video, which is what we want for archival preservation. The uncompressed video is identical at the bit level; only the related metadata differ. It therefore does not really matter which container is chosen for uncompressed video. Note, however, that MP4 is open, whereas MOV and AVI are proprietary.

MXF (Material eXchange Format) is used, for example, for the DCP (Digital Cinema Package).

Matroska is an open container that can be used for archiving.

The Flash container has been designed for streaming and therefore cannot be used for archiving.

Video Codec (slide 25)

The open video codec H.264 is mostly used in its numerous compressed flavours, but H.264 can also code uncompressed video.

The video codec families Apple ProRes 422 and Avid DNxHD are often used for professional postproduction and have become the de facto standard in the industry.

The open video codec FFV1 can be used with the open container Matroska.

Audio Codec (slide 26)

BWF (Broadcast Wave Format) is an extension of WAVE, defined by the EBU (European Broadcasting Union) and also recommended by the ABU (Asia-Pacific Broadcasting Union) for archiving and professional postproduction.

The open audio codec FLAC can be used with the open container Matroska.
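
The combination of FFV1, FLAC and Matroska mentioned above can be sketched as a transcoding command. The following is only a minimal illustration, assuming the FFmpeg command-line tool is used (the text does not prescribe a tool); the function name and the file names are hypothetical. The function only builds the argument list and does not run anything.

```python
from pathlib import Path


def build_ffv1_flac_command(source: str, target: str) -> list[str]:
    """Return an ffmpeg argument list for FFV1 video + FLAC audio in Matroska.

    Assumption: FFmpeg is the tool of choice; the source text names only
    the codecs and the container, not the software.
    """
    if Path(target).suffix.lower() != ".mkv":
        raise ValueError("target should use the Matroska extension .mkv")
    return [
        "ffmpeg",
        "-i", source,      # input file (hypothetical name supplied by caller)
        "-c:v", "ffv1",    # open, lossless video codec
        "-level", "3",     # select FFV1 version 3
        "-c:a", "flac",    # open, lossless audio codec
        target,            # Matroska output
    ]


if __name__ == "__main__":
    print(" ".join(build_ffv1_flac_command("input.mov", "output.mkv")))
```

The returned list could be passed to `subprocess.run` on a machine where FFmpeg is installed.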

AAC (Advanced Audio Coding) and MP3 are often used for streaming, because they give compact files. As those files are compressed with data loss, they are useful for access, but not for postproduction or archiving.

Raw Data (slide 27)

Data is anything but “raw”.

Audio Data (slide 28)

Many different parameters should be considered regarding the raw audio data. Yet here we mention only the channel layout, which specifies the spatial disposition of the channels in a multi-channel audio stream.

Video Data (slide 29)

Many different parameters should be considered regarding the raw video data. Yet here we briefly discuss only the colour schema, often called pixel format. Current colour schemata include: [1]

  • rgb48le = RGB with 16 bit per primary, in total 48 bit per pixel
  • yuv444p16le = Y′CBCR 4:4:4 with 16 bit for luma and 16 bit for each chroma, in total 48 bit per pixel
  • rgb24 = R′G′B′ with 8 bit per channel, in total 24 bit per pixel [2]
  • yuv422p10le = Y′CBCR 4:2:2 with 10 bit for luma and 10 bit for both chroma combined, in total 20 bit per pixel
  • uyvy422 = Y′CBCR 4:2:2 with 8 bit for luma and 8 bit for both chroma combined, in total 16 bit per pixel
  • yuv420p = Y′CBCR 4:2:0 with 8 bit for luma and 4 bit for both chroma combined, in total 12 bit per pixel
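
The totals in the list above follow from the bit depth and the chroma subsampling. As a rough sketch (the function and table names are hypothetical, chosen only for this illustration), the arithmetic can be checked by counting samples in a block of 2 × 2 pixels:

```python
# Number of chroma samples (Cb and Cr together) in a block of 2 x 2 pixels,
# which always holds 4 luma samples.
CHROMA_SAMPLES_PER_2X2 = {
    "4:4:4": 8,  # 4 Cb + 4 Cr: no subsampling
    "4:2:2": 4,  # 2 Cb + 2 Cr: halved horizontally
    "4:2:0": 2,  # 1 Cb + 1 Cr: halved horizontally and vertically
}


def bits_per_pixel_ycbcr(subsampling: str, bit_depth: int) -> float:
    """Average bits per pixel for Y'CbCr with the given subsampling."""
    samples = 4 + CHROMA_SAMPLES_PER_2X2[subsampling]  # luma + chroma samples
    return bit_depth * samples / 4                     # averaged over 4 pixels


def bits_per_pixel_rgb(bit_depth: int) -> int:
    """Bits per pixel for RGB: three full-resolution channels."""
    return 3 * bit_depth


if __name__ == "__main__":
    print("rgb48le     ", bits_per_pixel_rgb(16))
    print("yuv444p16le ", bits_per_pixel_ycbcr("4:4:4", 16))
    print("rgb24       ", bits_per_pixel_rgb(8))
    print("yuv422p10le ", bits_per_pixel_ycbcr("4:2:2", 10))
    print("uyvy422     ", bits_per_pixel_ycbcr("4:2:2", 8))
    print("yuv420p     ", bits_per_pixel_ycbcr("4:2:0", 8))
```

These are average values for the payload only; how the samples are padded or packed in memory is not modelled here.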

“Image Codec” (slide 30)

The “image codec” is not a term used in the literature. I use it for moving images stored as single image files, to distinguish them from a video stream.

Single image based formats include:

  • TIFF is well known in the graphic industry and photography.
  • DPX is the primary choice in cinema and television production workflows.
  • JPEG 2000 provides lossless compression by a factor of approximately 2.3 and is recommended by the FIAF (Fédération internationale des Archives du Film, International Federation of Film Archives).
  • OpenEXR has been developed for special effects and, therefore, has a wider dynamic range. This makes it an excellent format for film restoration work!
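
To give a feeling for what a lossless factor of approximately 2.3 means in practice, here is a small back-of-the-envelope sketch. The frame dimensions and the pixel format are assumptions made only for this illustration (they do not come from the text), and the function names are hypothetical; only the factor 2.3 is taken from the list above.

```python
LOSSLESS_FACTOR = 2.3  # approximate JPEG 2000 lossless factor quoted above


def uncompressed_frame_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Size of one uncompressed frame in bytes (payload only, no header)."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


def estimated_lossless_bytes(uncompressed: int,
                             factor: float = LOSSLESS_FACTOR) -> float:
    """Rough estimate of the size after lossless compression."""
    return uncompressed / factor


if __name__ == "__main__":
    # Assumed example: a 2048 x 1556 pixel scan at 48 bit per pixel.
    raw = uncompressed_frame_bytes(2048, 1556, 48)
    packed = estimated_lossless_bytes(raw)
    print(f"uncompressed frame: {raw / 1e6:.1f} MB")
    print(f"lossless estimate:  {packed / 1e6:.1f} MB")
```

The real ratio depends on the image content, so the result should be read as an order of magnitude rather than a prediction.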

“Image Container” (slide 31)

The “image container” is not a term used in the literature. It could be as simple as a folder in which, for example, all the TIFF files of a single film reel are stored.

Some film archives recommend tar (tape archiver), which is both a command-line tool and a file format. This is old technology that we do not recommend.

The container MXF (Material eXchange Format) is often used, especially when working with JPEG 2000 images. [3] The container Motion JPEG could also be used, as recommended by the FIAF, but this is more complex than using MXF. Therefore we recommend MXF.


[1] The naming is a little confusing, because in a cinema context all bits per pixel are counted (RGB and R′G′B′), but in a video context the number of bits per colour channel is indicated (Y′CBCR). In addition, “le” means little endian and “p” here means planar, not progressive.
[2] This is exactly the RGB24 discussed on slide 14.
[3] Example: The DCP, used today to screen films in a theatre, has the container MXF and the image codec JPEG 2000.