Apple ProRes Bitstream Syntax
SMPTE REGISTERED
DISCLOSURE DOCUMENT
Authorized licensed use limited to: IEEE Xplore. Downloaded on October 11,2022 at 00:02:19 UTC from IEEE Xplore. Restrictions apply.
SMPTE RDD 36:2015
Introduction
Apple ProRes is a video compression scheme developed by Apple Inc. for use in workflows that
require high quality and efficient performance. It is an intra-frame codec that can encode
progressive or interlaced frames with arbitrary dimensions and either 4:2:2 or 4:4:4 chroma
sampling. It operates on Y′CbCr video data; the pixel component samples can have bit depths
of 12 or even more bits per sample, which enables ProRes to be used for RGB video data (via
conversion to Y′CbCr) with high quality results. Frames can also include an alpha channel, with
up to 16 bits per alpha sample, which ProRes encodes losslessly.
1 Scope
This SMPTE Registered Disclosure Document (RDD) includes specifications for the Apple
ProRes bitstream syntax, the bitstream element semantics, and the decoding process used
to produce decompressed images. A reference implementation that reads ProRes bitstreams
from a file and decompresses the bitstreams is part of the contribution. Sample bitstreams and
the resulting decompressed images have also been contributed for exercising the reference
implementation. This RDD does not describe the Apple QuickTime file format or the details of
storing ProRes bitstreams in QuickTime files.
2 References
IEEE Std 1180-1990, IEEE Standard Specifications for the Implementations of 8x8 Inverse
Discrete Cosine Transform.
ISO/IEC 13818-2:2000, Information technology — Generic coding of moving pictures and
associated audio information: Video.
Recommendation ITU-R BT.601-7, Studio encoding parameters of digital television for standard
4:3 and wide-screen 16:9 aspect ratios.
Recommendation ITU-R BT.709-5, Parameter values for the HDTV standards for production
and international programme exchange.
Recommendation ITU-R BT.2020-1, Parameter values for ultra-high definition television
systems for production and international programme exchange.
Recommendation ITU-T H.264 (02/2014), Advanced video coding for generic audiovisual
services.
SMPTE ST 2084:2014, High Dynamic Range Electro-Optical Transfer Function of Mastering
Reference Displays.
3 Notation
\frac{x}{y} Division, x ÷ y
√x Square root of x
ceil(x) Least integer greater than or equal to x
cos(x) Cosine of x, with x in units of radians
floor(x) Greatest integer less than or equal to x
log2(x) Base-2 logarithm of x
round(x) Integer nearest to x, for example floor(x + 1⁄2). If x = n + 1⁄2 for some integer n,
either n or n+1 is acceptable as the result.
3.7 Constants
π 3.14159 26535 89793 23846
5 Bitstream Syntax
The formal description of ProRes bitstream syntax is provided by the tables in this section. The
syntax tables are similar in style to those in ISO/IEC 13818-2 and ITU-T H.264. In particular,
the tables describe syntax using pseudo-code based on the C programming language.
The syntax description involves several special constructs. Syntax elements are the fundamental
parameters that describe the compressed image or direct the exact nature of the decoding
process. There are two types of syntax elements: bitstream syntax elements and derived
syntax elements. The former appear directly in ProRes bitstreams, while the latter are
calculated from bitstream syntax elements or other derived syntax elements. Both are denoted
by names using all lowercase letters with underscore characters separating words; the names
of bitstream syntax elements appear in boldface in the syntax tables where they occur in the
bitstream (though not in subsequent usage) and in the headings of their semantic descriptions.
Syntax structures are collections of syntax elements, other syntax structures, or both. They are
denoted by functions whose names use all lowercase letters with underscore characters
separating words. The functions can take arguments, which identify a specific instance of the
syntax structure or provide information relevant to the parsing or decoding process of the
structure. The sizes of some syntax structures are specified in ProRes bitstreams as described
subsequently. Decoders shall use these sizes to determine the start of the immediately
following syntax structure. (See Section 6.4, “Bitstream Versions, Version Variants, and
Compatibility,” for more detail.)
Finally, syntax variables and syntax functions are used within syntax structures as part of the
description of those structures. To distinguish them from syntax elements and structures, their
names use a mixture of lowercase and uppercase letters and do not include underscore
characters. Syntax variables are defined implicitly by their use in the syntax tables.
The syntax function endOfData() takes an argument that specifies the size in bytes of a syntax
structure and returns true if the number of remaining bits (the difference between the
structure’s size in bits and the number of bits representing the preceding bitstream syntax
elements of the structure) is 31 or less and the remaining bits, if any, all have value 0; it
returns false if
32 or more bits remain or if any of the remaining bits has the value 1. The syntax function
byteAligned() returns true if the number of bits representing the preceding bitstream syntax
elements of the associated syntax structure is a multiple of eight and returns false if not. The
syntax function endOfStructure() takes an argument that specifies the size in bytes of a syntax
structure and returns true if the number of remaining bits is 0 and false if one or more bits
remain. The syntax function isModuloAlphaDifference(), which returns true or false, is used in
the alpha channel decoding process and is defined in Section 7.1.2, “Scanned Alpha.”
ProRes bitstream syntax elements belong to one of three categories: fixed-length bit strings,
fixed-length numerical values, or variable-length codes. In the syntax tables, the “Descriptor”
column indicates the category to which each bitstream syntax element belongs. Fixed-length bit
strings are denoted by “f(n),” where n is the number of bits in the string. Fixed-length numerical
values are unsigned integers and are denoted by “u(n),” where n is the number of bits used to
represent the value. Variable-length codes are denoted by “vlc.” Bit strings and variable-length
codes appear in the bitstream left bit first; numerical values appear most-significant bit first.
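The descriptor categories and syntax functions above can be illustrated with a small bit reader. The following is a minimal sketch, not part of the reference implementation: the class name `BitReader` and its method names are hypothetical, and it assumes the reader is positioned at the first bit of the syntax structure whose size is being tested.

```python
class BitReader:
    """Hypothetical helper: reads fixed-length fields, most-significant bit first."""

    def __init__(self, data: bytes):
        self.data = data
        self.bit_pos = 0  # bits consumed since the start of the structure

    def read_u(self, n: int) -> int:
        """u(n): unsigned integer of n bits, most-significant bit first."""
        if self.bit_pos + n > 8 * len(self.data):
            raise EOFError("not enough bits left")
        value = 0
        for _ in range(n):
            byte = self.data[self.bit_pos >> 3]
            bit = (byte >> (7 - (self.bit_pos & 7))) & 1
            value = (value << 1) | bit
            self.bit_pos += 1
        return value

    def read_f(self, n: int) -> str:
        """f(n): bit string of n bits, left bit first, as '0'/'1' text."""
        return format(self.read_u(n), "0{}b".format(n)) if n else ""

    def byte_aligned(self) -> bool:
        """True if the bits consumed so far are a multiple of eight."""
        return self.bit_pos % 8 == 0

    def end_of_structure(self, size_in_bytes: int) -> bool:
        """True if no bits of the structure remain."""
        return 8 * size_in_bytes - self.bit_pos == 0

    def end_of_data(self, size_in_bytes: int) -> bool:
        """True if 31 or fewer bits remain and all of them are 0."""
        remaining = max(8 * size_in_bytes - self.bit_pos, 0)
        if remaining >= 32:
            return False
        saved = self.bit_pos
        rest = self.read_u(remaining)  # peek at the remaining bits
        self.bit_pos = saved           # restore the read position
        return rest == 0
```

Variable-length codes are not handled here; they would be layered on top of `read_u`.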
Figure 2 gives an overview of the hierarchy of ProRes bitstream syntax structures.
frame() {                                                  Descriptor
    frame_size                                             u(32)
    frame_identifier                                       f(32)
    frame_header()
    picture(“first”)
    if (interlace_mode == 1 || interlace_mode == 2)
        picture(“second”)
    if (stuffing_size > 0)
        stuffing()
}
frame_header() {                                           Descriptor
    frame_header_size                                      u(16)
    reserved                                               u(8)
    bitstream_version                                      u(8)
    encoder_identifier                                     f(32)
    horizontal_size                                        u(16)
    vertical_size                                          u(16)
    chroma_format                                          u(2)
    reserved                                               u(2)
    interlace_mode                                         u(2)
    reserved                                               u(2)
    aspect_ratio_information                               u(4)
    frame_rate_code                                        u(4)
    color_primaries                                        u(8)
    transfer_characteristic                                u(8)
    matrix_coefficients                                    u(8)
    reserved                                               u(4)
    alpha_channel_type                                     u(4)
    reserved                                               u(14)
    load_luma_quantization_matrix                          u(1)
    load_chroma_quantization_matrix                        u(1)
    if (load_luma_quantization_matrix) {
        for (v = 0; v < 8; v++)
            for (u = 0; u < 8; u++)
                luma_quantization_matrix[v][u]             u(8)
    }
    if (load_chroma_quantization_matrix) {
        for (v = 0; v < 8; v++)
            for (u = 0; u < 8; u++)
                chroma_quantization_matrix[v][u]           u(8)
    }
}
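Because every field in frame_header() falls on fixed bit positions, the fixed part occupies 20 bytes and can be unpacked directly. The sketch below is illustrative only: the function name `parse_frame_header` and the returned dictionary layout are assumptions, and no validation of reserved bits or of frame_header_size is attempted.

```python
import struct


def parse_frame_header(buf: bytes) -> dict:
    """Hypothetical parser for the frame_header() syntax structure."""
    (header_size, _reserved, version, encoder_id, width, height,
     format_byte, ratio_rate_byte, primaries, transfer, matrix,
     alpha_byte, flags) = struct.unpack_from(">HBB4sHHBBBBBBH", buf, 0)

    header = {
        "frame_header_size": header_size,
        "bitstream_version": version,
        "encoder_identifier": encoder_id,
        "horizontal_size": width,
        "vertical_size": height,
        "chroma_format": format_byte >> 6,            # top 2 bits
        "interlace_mode": (format_byte >> 2) & 0x3,   # bits 3..2
        "aspect_ratio_information": ratio_rate_byte >> 4,
        "frame_rate_code": ratio_rate_byte & 0xF,
        "color_primaries": primaries,
        "transfer_characteristic": transfer,
        "matrix_coefficients": matrix,
        "alpha_channel_type": alpha_byte & 0xF,       # low 4 bits
        "load_luma_quantization_matrix": (flags >> 1) & 1,
        "load_chroma_quantization_matrix": flags & 1,
    }

    offset = 20  # size of the fixed-length part in bytes
    for name in ("luma", "chroma"):
        if header["load_{}_quantization_matrix".format(name)]:
            raw = buf[offset:offset + 64]
            if len(raw) < 64:
                raise ValueError("truncated quantization matrix")
            # Entries are stored row by row: index [v][u]
            header["{}_quantization_matrix".format(name)] = [
                list(raw[8 * v:8 * v + 8]) for v in range(8)
            ]
            offset += 64
    return header
```

Per the rule on specified sizes, a decoder would locate the structure that follows by using frame_header_size rather than the `offset` computed here.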
stuffing() {                                               Descriptor
    for (m = 0; m < stuffing_size; m++)
        zero_byte  /* Equal to 0x00 */                     f(8)
}

picture(temporalOrder) {                                   Descriptor
    picture_header()
    slice_table()
    for (i = 0; i < height_in_mb; i++)
        for (j = 0; j < number_of_slices_per_mb_row; j++)
            slice(i, j)
}

picture_header() {                                         Descriptor
    picture_header_size                                    u(5)
    reserved                                               u(3)
    picture_size                                           u(32)
    deprecated_number_of_slices                            u(16)
    reserved                                               u(2)
    log2_desired_slice_size_in_mb                          u(2)
    reserved                                               u(4)
}

slice_table() {                                            Descriptor
    for (i = 0; i < height_in_mb; i++)
        for (j = 0; j < number_of_slices_per_mb_row; j++)
            coded_size_of_slice[i][j]                      u(16)
}
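Since picture() places the slices immediately after slice_table() in the same (i, j) order, the table also yields the location of every slice. The sketch below illustrates this; the function name `parse_slice_table` is hypothetical, and offsets are expressed in bytes relative to the first byte of the slice table.

```python
import struct


def parse_slice_table(buf: bytes, height_in_mb: int,
                      number_of_slices_per_mb_row: int):
    """Hypothetical helper: return (sizes, offsets), both indexed [i][j]."""
    n = number_of_slices_per_mb_row
    count = height_in_mb * n
    # Each coded_size_of_slice[i][j] is a big-endian u(16)
    flat = struct.unpack_from(">{}H".format(count), buf, 0)

    sizes, offsets = [], []
    position = 2 * count  # first slice starts right after the table
    for i in range(height_in_mb):
        row = list(flat[i * n:(i + 1) * n])
        row_offsets = []
        for size in row:
            row_offsets.append(position)
            position += size
        sizes.append(row)
        offsets.append(row_offsets)
    return sizes, offsets
```

Knowing every slice offset up front is what allows slices to be decoded independently, for example in parallel.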
slice(i, j) {                                              Descriptor
    slice_header()
    numYBlocks = 4 * slice_size_in_mb[j]
    if (chroma_format == 3)  /* 4:4:4 */
        numCBlocks = 4 * slice_size_in_mb[j]
    else  /* 4:2:2 */
        numCBlocks = 2 * slice_size_in_mb[j]
    codedYDataSize = coded_size_of_y_data
    codedCbDataSize = coded_size_of_cb_data
    if (alpha_channel_type != 0)
        codedCrDataSize = coded_size_of_cr_data
    else
        codedCrDataSize = coded_size_of_slice[i][j]
                          - slice_header_size
                          - codedYDataSize
                          - codedCbDataSize
    scanned_coefficients(scannedYCoeffs, numYBlocks,
                         codedYDataSize)
    scanned_coefficients(scannedCbCoeffs, numCBlocks,
                         codedCbDataSize)
    scanned_coefficients(scannedCrCoeffs, numCBlocks,
                         codedCrDataSize)
    if (alpha_channel_type != 0) {
        if (i < height_in_mb - 1)
            sliceVerticalSize = 16
        else
            sliceVerticalSize = picture_vertical_size
                                - 16 * (height_in_mb - 1)
        numAlphaValues = 16 * slice_size_in_mb[j]
                         * sliceVerticalSize
        scanned_alpha(scannedAlphaValues, numAlphaValues)
    }
}
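The derived quantities in slice() can be computed as in the following sketch. The function name `slice_layout` and its parameter names are hypothetical; the arithmetic mirrors the syntax table, including the rule that the Cr size is explicit only when alpha is present and is otherwise inferred from the slice size.

```python
def slice_layout(slice_size_mb, chroma_format, alpha_channel_type,
                 coded_size_of_slice, slice_header_size,
                 coded_size_of_y_data, coded_size_of_cb_data,
                 coded_size_of_cr_data=None,
                 i=0, height_in_mb=1, picture_vertical_size=16):
    """Hypothetical helper returning the per-slice block counts and sizes."""
    num_y_blocks = 4 * slice_size_mb
    # 4:4:4 has four chroma blocks per macroblock, 4:2:2 has two
    num_c_blocks = (4 if chroma_format == 3 else 2) * slice_size_mb

    if alpha_channel_type != 0:
        coded_cr = coded_size_of_cr_data          # explicit in slice_header()
    else:
        coded_cr = (coded_size_of_slice - slice_header_size
                    - coded_size_of_y_data - coded_size_of_cb_data)

    layout = {
        "num_y_blocks": num_y_blocks,
        "num_c_blocks": num_c_blocks,
        "coded_cr_data_size": coded_cr,
    }
    if alpha_channel_type != 0:
        if i < height_in_mb - 1:
            slice_vertical_size = 16
        else:  # last macroblock row may be only partly covered by the picture
            slice_vertical_size = picture_vertical_size - 16 * (height_in_mb - 1)
        layout["num_alpha_values"] = 16 * slice_size_mb * slice_vertical_size
    return layout
```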
slice_header() {                                           Descriptor
    slice_header_size                                      u(5)
    reserved                                               u(3)
    quantization_index                                     u(8)
    coded_size_of_y_data                                   u(16)
    coded_size_of_cb_data                                  u(16)
    if (alpha_channel_type != 0)
        coded_size_of_cr_data                              u(16)
}
6 Bitstream Semantics
frame_identifier
A four-character code that identifies the bitstream as a ProRes frame. This will be ‘icpf’
(0x69637066).
bitstream_version
The version number of the bitstream. The version number is incremented when a change is
made to bitstream syntax or semantics that breaks compatibility with existing decoders. A
decoder shall abort if it encounters a bitstream with an unsupported bitstream_version value.
If 0, the value of the chroma_format syntax element shall be 2 (4:2:2 sampling) and the value
of the alpha_channel_type element shall be 0 (no encoded alpha); if 1, any permissible value
may be used for those syntax elements.
encoder_identifier
A four-character code that identifies the encoder vendor or product that generated the
compressed frame. Apple maintains a registry of the codes for encoder licensees. Decoders
should ignore this element.
horizontal_size
The width of the frame in luma samples.
vertical_size
The height of the frame in luma samples.
chroma_format
A code specifying the sampling format of the frame. Values and their meanings are listed in
Table 1.
chroma_format Meaning
0 Reserved
1 Reserved
2 4:2:2
3 4:4:4
interlace_mode
A code specifying whether the frame is progressive or interlaced (as well as field order in the
latter case). Values and their meanings are listed in Table 2.
interlace_mode Meaning
0 Progressive frame (frame contains one full-height picture)
1 Interlaced frame (first picture is top field)
2 Interlaced frame (second picture is top field)
3 Reserved
aspect_ratio_information
A code indicating the pixel or image aspect ratio. Values and their meanings are listed in
Table 3.
aspect_ratio_information Meaning
0 Unknown/unspecified
1 Square pixels
2 4:3 image aspect ratio
3 16:9 image aspect ratio
4 to 15 Reserved
frame_rate_code
A code indicating frame rate. Values and their meanings are listed in Table 4.
frame_rate_code Meaning
0 Unknown/unspecified
1 24 ÷ 1.001 (23.976)
2 24
3 25
4 30 ÷ 1.001 (29.97)
5 30
6 50
7 60 ÷ 1.001 (59.94)
8 60
9 100
10 120 ÷ 1.001 (119.88)
11 120
12 to 15 Reserved
color_primaries
A code indicating the chromaticity coordinates of the source primaries and white point. Values
and their meanings are listed in Table 5.
color_primaries   Meaning
0                 Unknown/unspecified
1                 ITU-R BT.709
                                   x        y
                  Red              0.640    0.330
                  Green            0.300    0.600
                  Blue             0.150    0.060
                  White (D65)      0.3127   0.3290
2                 Unknown/unspecified
3                 Reserved
4                 Reserved
5                 ITU-R BT.601 625
                                   x        y
                  Red              0.640    0.330
                  Green            0.290    0.600
                  Blue             0.150    0.060
                  White (D65)      0.3127   0.3290
6                 ITU-R BT.601 525
                                   x        y
                  Red              0.630    0.340
                  Green            0.310    0.595
                  Blue             0.155    0.070
                  White (D65)      0.3127   0.3290
7                 Reserved
8                 Reserved
9                 ITU-R BT.2020
                                   x        y
                  Red              0.708    0.292
                  Green            0.170    0.797
                  Blue             0.131    0.046
                  White (D65)      0.3127   0.3290
10                Reserved
11                DCI P3
                                   x        y
                  Red              0.680    0.320
                  Green            0.265    0.690
                  Blue             0.150    0.060
                  White (DCI)      0.314    0.351
12                P3 D65
                                   x        y
                  Red              0.680    0.320
                  Green            0.265    0.690
                  Blue             0.150    0.060
                  White (D65)      0.3127   0.3290
13 to 255         Reserved
transfer_characteristic
A code indicating the opto-electronic transfer characteristic of the source video data. The
values 0 and 2 mean unknown/unspecified; the value 1 signifies the function specified by
ITU-R BT.601/BT.709/BT.2020, namely
$$V = \begin{cases} \alpha L^{0.45} - (\alpha - 1), & \beta \le L \le 1 \\ 4.5\,L, & 0 \le L \le \beta, \end{cases}$$
where L is normalized linear optical intensity, V is the corresponding non-linear (gamma
pre-corrected) signal value, α = 1.099 296 826 809 44, and β = 0.018 053 968 510 807; and
the value 16 signifies the Inverse-EOTF formula in Section 5.3 of SMPTE ST 2084:2014,
namely
$$V = \left( \frac{c_1 + c_2 L^{m_1}}{1 + c_3 L^{m_1}} \right)^{m_2},$$
where L and V are as above (here L is normalized so that L = 1 corresponds to an absolute
optical intensity of 10,000 cd/m²), m1 = 0.25 * (2610 ÷ 4096), m2 = 128 * (2523 ÷ 4096),
c1 = c3 − c2 + 1 = 3424 ÷ 4096, c2 = 32 * (2413 ÷ 4096), and c3 = 32 * (2392 ÷ 4096). All other
values are reserved.
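For reference, the two transfer characteristics can be evaluated numerically as in the sketch below. The function and constant names are hypothetical; the constants are the ones stated above, and inputs are assumed to be normalized to the range 0 to 1.

```python
ALPHA = 1.09929682680944
BETA = 0.018053968510807

M1 = 0.25 * (2610 / 4096)
M2 = 128 * (2523 / 4096)
C2 = 32 * (2413 / 4096)
C3 = 32 * (2392 / 4096)
C1 = C3 - C2 + 1  # equals 3424 / 4096


def oetf_bt709(l: float) -> float:
    """transfer_characteristic 1: power-law segment above BETA, linear below."""
    if l >= BETA:
        return ALPHA * l ** 0.45 - (ALPHA - 1)
    return 4.5 * l


def inverse_eotf_st2084(l: float) -> float:
    """transfer_characteristic 16: L = 1 corresponds to 10,000 cd/m^2."""
    lm = l ** M1
    return ((C1 + C2 * lm) / (1 + C3 * lm)) ** M2
```

The two branches of `oetf_bt709` meet at L = β, which is why the choice of branch at exactly that point does not matter.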
matrix_coefficients
A code indicating the matrix coefficients used to derive luma and chroma values from the red,
green, and blue primaries. Values and their meanings are listed in Table 6. For values that
specify luma coefficients (KR, KG, and KB), the derivation is E′Y = KR * E′R + KG * E′G + KB * E′B,
E′Cb = (E′B − E′Y) ÷ (2 * (1 − KB)), and E′Cr = (E′R − E′Y) ÷ (2 * (1 − KR)), where E′Y is the
normalized luma value and E′Cb and E′Cr are the normalized chroma values corresponding to
the normalized gamma pre-corrected primary values E′R, E′G, and E′B.
Table 6 – Meaning of matrix_coefficients
matrix_coefficients Meaning
0 Unknown/unspecified
1 KR = 0.2126, KG = 0.7152, KB = 0.0722 (ITU-R BT.709)
2 Unknown/unspecified
3 Reserved
4 Reserved
5 Reserved
6 KR = 0.299, KG = 0.587, KB = 0.114 (ITU-R BT.601)
7 Reserved
8 Reserved
9 KR = 0.2627, KG = 0.6780, KB = 0.0593 (ITU-R BT.2020)
10 to 255 Reserved
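The derivation given for matrix_coefficients can be sketched as follows. The function name `rgb_to_ycbcr` and the dictionary `LUMA_COEFFICIENTS` are hypothetical; inputs are normalized gamma pre-corrected primary values, and KG is taken as 1 − KR − KB, which matches the three rows of Table 6.

```python
# (KR, KB) for each matrix_coefficients code that defines luma coefficients
LUMA_COEFFICIENTS = {
    1: (0.2126, 0.0722),  # ITU-R BT.709
    6: (0.299, 0.114),    # ITU-R BT.601
    9: (0.2627, 0.0593),  # ITU-R BT.2020
}


def rgb_to_ycbcr(r: float, g: float, b: float, matrix_coefficients: int):
    """Return normalized (E'Y, E'Cb, E'Cr) for normalized E'R, E'G, E'B."""
    kr, kb = LUMA_COEFFICIENTS[matrix_coefficients]
    kg = 1.0 - kr - kb
    y = kr * r + kg * g + kb * b
    cb = (b - y) / (2 * (1 - kb))
    cr = (r - y) / (2 * (1 - kr))
    return y, cb, cr
```

With this normalization, luma lies in the range 0 to 1 and each chroma value lies in the range −0.5 to 0.5.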
alpha_channel_type
A code specifying the type of alpha channel data encoded in the bitstream, if any. Values and
their meanings are listed in Table 7. If the value of the bitstream_version syntax element
is 0, the value of this element shall be 0. There are no such restrictions for other values
of bitstream_version; in particular the value of this element may be 0 when the value of
bitstream_version is not 0. Note: Slice syntax is affected by the value of this element.
alpha_channel_type Meaning
0 No encoded alpha data present in bitstream
1 8 bits/sample integral alpha
2 16 bits/sample integral alpha
3 to 15 Reserved
load_luma_quantization_matrix
A flag indicating whether a custom luma quantization matrix is specified. If 0, the default
matrix shall be used.
load_chroma_quantization_matrix
A flag indicating whether a custom chroma quantization matrix is specified. If 0, the
luma matrix shall be used (i.e., the specified custom luma quantization matrix if
load_luma_quantization_matrix is 1 or the default matrix otherwise).
luma_quantization_matrix
Custom quantization weight matrix for luma coefficients. Each entry of the matrix will be in the
range 2, 3, …, 63.
chroma_quantization_matrix
Custom quantization weight matrix for chroma coefficients. Each entry of the matrix will be in
the range 2, 3, …, 63.
zero_byte
An eight-bit number with value zero (0x00). Optionally used to pad the compressed frame up
to a desired size.
width_in_mb
The width of the encoded picture in macroblocks. Derived from horizontal_size:
width_in_mb = (horizontal_size + 15) / 16
If the encoded picture width in luma samples, 16 * width_in_mb, exceeds horizontal_size,
encoders will append 16 * width_in_mb − horizontal_size additional pixels to the end (right)
of each row of the source picture; decoders shall discard the excess pixel(s) from the end
(right) of each row of the decoded picture. Note: When a frame consists of two pictures,
width_in_mb is the same for each.
height_in_mb
The height of the encoded picture in macroblocks. Derived from picture_vertical_size:
height_in_mb = (picture_vertical_size + 15) / 16
If the encoded picture height in luma samples, 16 * height_in_mb, exceeds
picture_vertical_size, encoders will append 16 * height_in_mb − picture_vertical_size
additional rows of pixels to the end (bottom) of the source picture; decoders shall discard the
excess row(s) of pixels from the end (bottom) of the decoded picture.
slice_size_in_mb
Array of sizes (in macroblocks) of slices within a single macroblock row of the encoded
picture, starting with the first (leftmost) slice and ending with the last (rightmost) one. The
array pertains to all macroblock rows. The entries are calculated as follows:
j = 0
sliceSize = 1 << log2_desired_slice_size_in_mb
numMbsRemainingInRow = width_in_mb
do {
while (numMbsRemainingInRow >= sliceSize) {
slice_size_in_mb[j++] = sliceSize
numMbsRemainingInRow -= sliceSize
}
sliceSize /= 2
} while (numMbsRemainingInRow > 0)
number_of_slices_per_mb_row = j
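The calculation above translates directly into the following sketch. The function name `slice_sizes_in_mb` is hypothetical; it takes horizontal_size and log2_desired_slice_size_in_mb and returns the slice_size_in_mb array, whose length is number_of_slices_per_mb_row. Integer division is used for width_in_mb, as in the derivation of that element.

```python
def slice_sizes_in_mb(horizontal_size: int,
                      log2_desired_slice_size_in_mb: int) -> list:
    """Hypothetical helper: sizes of the slices in one macroblock row."""
    width_in_mb = (horizontal_size + 15) // 16
    sizes = []
    slice_size = 1 << log2_desired_slice_size_in_mb
    remaining = width_in_mb
    while remaining > 0:
        # Emit as many slices of the current size as still fit
        while remaining >= slice_size:
            sizes.append(slice_size)
            remaining -= slice_size
        # Then halve the slice size to cover what is left of the row
        slice_size //= 2
    return sizes
```

For example, a 720-sample-wide picture has 45 macroblocks per row; with a desired slice size of 8 the row is covered by five slices of 8 macroblocks followed by slices of 4 and 1.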
number_of_slices_per_mb_row
The number of slices in a single macroblock row of the encoded picture. This corresponds to
the number of entries in the slice_size_in_mb array; like that array, it is the same for every
macroblock row.
deprecated_number_of_slices
The product of the picture height in macroblocks and the number of slices per macroblock row
when that product is 65535 or less, otherwise 0. Decoders shall ignore this element.
log2_desired_slice_size_in_mb
The base-2 logarithm of the desired number of macroblocks constituting a slice. Permissible
values for this element are 0, 1, 2, and 3, which correspond respectively to 1, 2, 4, and 8
macroblocks per slice.
coded_size_of_y_data
The size of the compressed luma (Y′) component data in bytes.
coded_size_of_cb_data
The size of the compressed blue chroma (Cb) component data in bytes.
coded_size_of_cr_data
The size of the compressed red chroma (Cr) component data in bytes. Note: This element is
present only if the value of the alpha_channel_type syntax element is non-zero.
dc_coeff_difference
The difference between the current quantized DC coefficient and the previous one.
run
The number of consecutive zero-valued quantized AC coefficients in the scanned coefficient
array preceding one that is non-zero.
abs_level_minus_1
One less than the absolute value of the non-zero quantized AC coefficient that terminates the
preceding run of zero-valued coefficients.
sign
A code indicating the sign of the non-zero quantized AC coefficient that terminates the
preceding run of zero-valued coefficients. A value of 0 means the coefficient is positive; a value of 1
means it is negative.
zero_bit
A single bit with value 0. Used to ensure that the compressed color component data comprise
an integral number of bytes.
zero_byte
An eight-bit number with value zero (0x00). This syntax element serves no useful purpose but
will occasionally appear in ProRes bitstreams produced by older encoders.
run
The number of consecutive occurrences of the current alpha value in the scanned alpha value
array.
zero_bit
A single bit with value 0. Used to ensure that the compressed alpha component data
comprise an integral number of bytes.
bitstream versions denote intrinsic changes to the decoding process, while different version
variants correspond to non-essential distinctions.
A new bitstream version is required if a desired change in bitstream syntax or semantics breaks
compatibility with existing decoders, i.e., if existing decoders cannot properly decode such
bitstreams. The bitstream version will be incremented for each such change. A decoder that
can decode a ProRes bitstream with a particular bitstream version shall be able to decode a
bitstream with any earlier (lower) bitstream version. A decoder shall refuse to decode a ProRes
bitstream with an unsupported bitstream version. To maximize decoder compatibility, encoders
should use the lowest bitstream version appropriate for the frame being encoded and the
encoding parameters in effect.
Version variants correspond to the addition of informative data to ProRes bitstreams. Such
additional data will not include information that is required for correct decoding, and furthermore
will be added in a manner that does not prevent correct decoding by existing decoders that
would otherwise be capable of decoding the bitstream. As a consequence all version variants
of a ProRes bitstream version can be decoded by any decoder compatible with that bitstream
version.
ProRes bitstreams contain no explicit identification of version variant. Because unrecognized
version variant data can be present in a ProRes bitstream, for syntax structures with size
specified in the bitstream, decoders shall use the specified size—rather than inference from the
syntax itself—to determine the start of the immediately following syntax structure.
This specification describes bitstream versions 0 and 1. Version 0 bitstreams will have a value
of 2 (4:2:2 sampling) for the chroma_format syntax element and a value of 0 (no encoded alpha)
for the alpha_channel_type element; version 1 bitstreams can have any permissible value for
those elements. No version variants have been defined for either bitstream version.
7 Decoding Process
This section describes the process that a decoder shall follow to reconstruct a frame from a
ProRes bitstream. The process is carried out for each compressed slice in the bitstream and
consists of these steps:
• Entropy decoding is applied to each of the compressed video components of the slice to
produce arrays of scanned color component quantized discrete cosine transform (DCT)
coefficients and, if the ProRes bitstream includes an encoded alpha channel, an array of
raster-scanned alpha values;
• Inverse scanning is applied to each of the scanned color component quantized DCT
coefficient arrays to produce blocks of color component quantized DCT coefficients;
• Inverse quantization is applied to each of the color component quantized DCT coefficient
blocks to produce blocks of color component DCT coefficients;
• An inverse discrete cosine transform (IDCT) is applied to each of the color component
DCT coefficient blocks to produce blocks of reconstructed color component values;
• Each of the reconstructed color component values is converted to an integral sample of
desired bit depth and is written to the appropriate location in the decoded frame buffer
(as are the decoded alpha values, if any).
order-kRice Golomb-Rice codeword to obtain the encoded symbol n. Otherwise discard the first
lastRiceQ + 1 bits of the codeword (which are ‘0’s), decode the remaining bits as an order-kExp
exponential-Golomb codeword, and finally add (lastRiceQ + 1) * 2^kRice to the result to obtain the
encoded symbol n.
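The combination code can be sketched as below. Several things here are assumptions rather than statements of this document: the helper names are hypothetical, the bitstream is modeled as a string of '0' and '1' characters for readability, Golomb-Rice codewords are taken to be a unary quotient (zeros terminated by a '1') followed by a kRice-bit remainder, exponential-Golomb codewords are taken in their conventional form, and the Golomb-Rice branch is taken when the number of leading zeros does not exceed lastRiceQ.

```python
class Bits:
    """Hypothetical reader over a string of '0'/'1' characters."""

    def __init__(self, text: str):
        self.text = text
        self.pos = 0

    def read(self, n: int) -> int:
        value = int(self.text[self.pos:self.pos + n], 2) if n else 0
        self.pos += n
        return value

    def leading_zeros(self) -> int:
        """Count zeros before the next '1' without consuming anything."""
        count = 0
        while self.text[self.pos + count] == "0":
            count += 1
        return count


def decode_rice(bits: Bits, k: int) -> int:
    q = bits.leading_zeros()
    bits.read(q + 1)                  # unary quotient and its terminating '1'
    return (q << k) + bits.read(k)    # k-bit remainder


def decode_exp_golomb(bits: Bits, k: int) -> int:
    z = bits.leading_zeros()
    bits.read(z)                      # skip the zero prefix
    return bits.read(z + k + 1) - (1 << k)


def decode_rice_exp_combo(bits: Bits, last_rice_q: int,
                          k_rice: int, k_exp: int) -> int:
    if bits.leading_zeros() <= last_rice_q:
        return decode_rice(bits, k_rice)
    bits.read(last_rice_q + 1)        # discard the '0' bits that select escape
    return decode_exp_golomb(bits, k_exp) + ((last_rice_q + 1) << k_rice)
```

The added offset makes the two ranges contiguous: with parameters (1, 2, 3) the Golomb-Rice branch covers symbols 0 to 7 and the exponential-Golomb branch starts at 8.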
n S(n)
0 0
−1 1
+1 2
−2 3
+2 4
−3 5
+3 6
Notice that even symbols correspond to non-negative integers while odd symbols correspond to
negative ones. The inverse mapping is thus
$$n = \begin{cases} S(n) / 2, & S(n) \text{ even} \\ -\left((S(n) + 1) / 2\right), & S(n) \text{ odd.} \end{cases}$$
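The mapping and its inverse can be written out as follows; the function names are hypothetical, and the forward formula is inferred from the table above (non-negative n maps to 2n, negative n maps to −2n − 1).

```python
def int_to_symbol(n: int) -> int:
    """S(n): interleave non-negative and negative integers."""
    return 2 * n if n >= 0 else -2 * n - 1


def symbol_to_int(s: int) -> int:
    """Inverse of S(n): even symbols are non-negative, odd ones negative."""
    if s % 2 == 0:
        return s // 2
    return -((s + 1) // 2)
```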
7.1.1.3 DC Coefficients
The first quantized DC coefficient in the scanned coefficient array, first_dc_coeff, is determined
by decoding an order-5 exponential-Golomb codeword from the bitstream, then applying
the inverse of the signed integer-to-symbol mapping S(n) to the decoded symbol.
The remaining quantized DC coefficients in the scanned coefficient array are encoded
differentially. Variable-length coding is done adaptively, with the codebook for one
dc_coefficient_difference syntax element determined by the absolute value of the previous
one. The adaptation is specified in Table 9, where EXP_GOLOMB_CODE(k) denotes the
exponential-Golomb code of order k, RICE_EXP_COMBO_CODE(lastRiceQ, kRice, kExp) denotes
the Golomb-Rice/exponential-Golomb combination code with the indicated parameters, and
previousDCDiff refers to the value of the previous dc_coefficient_difference.
previousDCDiff Codebook
0 EXP_GOLOMB_CODE(0)
1 EXP_GOLOMB_CODE(1)
2 RICE_EXP_COMBO_CODE(1, 2, 3)
3 and above EXP_GOLOMB_CODE(3)
To determine the value of a dc_coefficient_difference syntax element, use the entry in Table 9
corresponding to the absolute value of previousDCDiff to decode a codeword from the
bitstream, then apply the inverse of the signed integer-to-symbol mapping S(n) to the decoded
symbol to obtain a signed integer n; dc_coefficient_difference is n if previousDCDiff ≥ 0 and
−n if previousDCDiff < 0. For each scanned coefficient array the value of previousDCDiff
is initially set to 3 (so that the exponential-Golomb code of order 3 is used for the
first dc_coefficient_difference syntax element of the array) and is updated as each
dc_coefficient_difference syntax element is determined.
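The adaptive DC decoding described in this section can be sketched as follows. To keep the example focused on the adaptation logic, codeword decoding is delegated to a caller-supplied function: `read_symbol(codebook)` is a hypothetical callback that decodes one symbol with the given codebook, and codebooks are represented by illustrative tuples. Each coefficient after the first is reconstructed by adding the decoded difference to the previous coefficient.

```python
def symbol_to_int(s: int) -> int:
    """Inverse of the signed integer-to-symbol mapping S(n)."""
    return s // 2 if s % 2 == 0 else -((s + 1) // 2)


def dc_codebook(previous_abs_diff: int) -> tuple:
    """Table 9 lookup, keyed by the absolute value of previousDCDiff."""
    if previous_abs_diff == 0:
        return ("EXP_GOLOMB_CODE", 0)
    if previous_abs_diff == 1:
        return ("EXP_GOLOMB_CODE", 1)
    if previous_abs_diff == 2:
        return ("RICE_EXP_COMBO_CODE", 1, 2, 3)
    return ("EXP_GOLOMB_CODE", 3)


def decode_dc_coeffs(read_symbol, num_blocks: int) -> list:
    """Return the quantized DC coefficients of one scanned coefficient array."""
    # first_dc_coeff uses a fixed order-5 exponential-Golomb code
    coeffs = [symbol_to_int(read_symbol(("EXP_GOLOMB_CODE", 5)))]
    previous_dc_diff = 3  # initial value, selects EXP_GOLOMB_CODE(3)
    for _ in range(num_blocks - 1):
        codebook = dc_codebook(abs(previous_dc_diff))
        n = symbol_to_int(read_symbol(codebook))
        # The sign of the decoded value is flipped after a negative difference
        diff = n if previous_dc_diff >= 0 else -n
        coeffs.append(coeffs[-1] + diff)
        previous_dc_diff = diff
    return coeffs
```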
7.1.1.4 AC Coefficients
The quantized AC coefficients in the scanned coefficient array are run-length encoded. Runs
consist of consecutive array elements with value zero. They are terminated either by a non-zero
array element (called a level) or by reaching the end of the array. Only runs that are terminated
by levels are encoded into the bitstream; a final run, if there is one, is implicit.
Variable-length coding of runs and levels is done separately and adaptively. The codebook for
one run syntax element is determined by the value of the previous one according to Table 10.
Table 10 – Codebook adaptation for scanned coefficients run syntax element
previousRun Codebook
0 RICE_EXP_COMBO_CODE(2, 0, 1)
1 RICE_EXP_COMBO_CODE(2, 0, 1)
2 RICE_EXP_COMBO_CODE(1, 0, 1)
3 RICE_EXP_COMBO_CODE(1, 0, 1)
4 EXP_GOLOMB_CODE(0)
5 RICE_EXP_COMBO_CODE(1, 1, 2)
6 RICE_EXP_COMBO_CODE(1, 1, 2)
7 RICE_EXP_COMBO_CODE(1, 1, 2)
8 RICE_EXP_COMBO_CODE(1, 1, 2)
9 EXP_GOLOMB_CODE(1)
10 EXP_GOLOMB_CODE(1)
11 EXP_GOLOMB_CODE(1)
12 EXP_GOLOMB_CODE(1)
13 EXP_GOLOMB_CODE(1)
14 EXP_GOLOMB_CODE(1)
15 and above EXP_GOLOMB_CODE(2)
The value of a run syntax element is the symbol obtained by decoding a codeword from the
bitstream using the entry in Table 10 corresponding to previousRun (the value of the previous
run syntax element). For each scanned coefficient array the value of previousRun is initially set
to 4 (so that the exponential-Golomb code of order 0 is used for the first run syntax element of
the array) and is updated as each run syntax element is determined.
Levels are encoded in sign-magnitude fashion. The abs_level_minus_1 syntax element
provides the level symbol, which is one less than the absolute value of the level, and the sign
syntax element indicates the sign of the level. The codebook for one abs_level_minus_1 syntax
element is determined by the value of the previous one according to Table 11.
Table 11 – Codebook adaptation for abs_level_minus_1 syntax element
previousLevelSymbol Codebook
0 RICE_EXP_COMBO_CODE(2, 0, 2)
1 RICE_EXP_COMBO_CODE(1, 0, 1)
2 RICE_EXP_COMBO_CODE(2, 0, 1)
3 EXP_GOLOMB_CODE(0)
4 EXP_GOLOMB_CODE(1)
5 EXP_GOLOMB_CODE(1)
6 EXP_GOLOMB_CODE(1)
7 EXP_GOLOMB_CODE(1)
8 and above EXP_GOLOMB_CODE(2)
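Tables 10 and 11 amount to two small lookup functions, sketched below. The function names and the tuple representation of a codebook are hypothetical; only the table contents are taken from this section.

```python
def run_codebook(previous_run: int) -> tuple:
    """Table 10: codebook for the next run, given the previous run."""
    if previous_run <= 1:
        return ("RICE_EXP_COMBO_CODE", 2, 0, 1)
    if previous_run <= 3:
        return ("RICE_EXP_COMBO_CODE", 1, 0, 1)
    if previous_run == 4:
        return ("EXP_GOLOMB_CODE", 0)
    if previous_run <= 8:
        return ("RICE_EXP_COMBO_CODE", 1, 1, 2)
    if previous_run <= 14:
        return ("EXP_GOLOMB_CODE", 1)
    return ("EXP_GOLOMB_CODE", 2)


def level_codebook(previous_level_symbol: int) -> tuple:
    """Table 11: codebook for the next abs_level_minus_1."""
    if previous_level_symbol == 0:
        return ("RICE_EXP_COMBO_CODE", 2, 0, 2)
    if previous_level_symbol == 1:
        return ("RICE_EXP_COMBO_CODE", 1, 0, 1)
    if previous_level_symbol == 2:
        return ("RICE_EXP_COMBO_CODE", 2, 0, 1)
    if previous_level_symbol == 3:
        return ("EXP_GOLOMB_CODE", 0)
    if previous_level_symbol <= 7:
        return ("EXP_GOLOMB_CODE", 1)
    return ("EXP_GOLOMB_CODE", 2)
```

A decoder would call `run_codebook(4)` for the first run of each scanned coefficient array, since previousRun is initially 4.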
Run lengths of one are assigned a codeword consisting of a single ‘1’ bit. Run lengths between
2 and 16, inclusive, are assigned five-bit codewords that begin with a ‘0’ bit and contain at least
one ‘1’ bit. Run lengths between 17 and 2048, inclusive, are assigned codewords consisting of
an escape code of five ‘0’ bits followed by an eleven-bit fixed-length code containing the binary
representation of one less than the run length. (Note: Each five-bit codeword also happens to
be the binary representation of one less than the run length to which it corresponds.) The value
of a run syntax element is the run length obtained by decoding a codeword from the bitstream
using Table 12.
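The run-length code described above can be decoded with a few bit reads. The following is a minimal sketch under stated assumptions: BitReader is a hypothetical helper (not defined by the RDD) that reads bits most-significant-bit first from a byte string, and decode_alpha_run is an illustrative name.

```python
class BitReader:
    """Hypothetical helper: reads bits MSB-first from a bytes object."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current bit position

    def read(self, n):
        value = 0
        for _ in range(n):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def decode_alpha_run(reader):
    """Decode one run length (1..2048) as described for Table 12."""
    if reader.read(1) == 1:
        return 1                      # single '1' bit: run length of one
    low = reader.read(4)              # rest of the five-bit codeword
    if low != 0:
        return low + 1                # codeword value is run length minus 1
    return reader.read(11) + 1        # escape: eleven-bit run length minus 1
```

For example, the two bytes 0x00 0x10 hold the escape code followed by the eleven-bit value 16, which decodes to a run length of 17.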
Run alpha values are encoded differentially. Each alpha_difference syntax element provides
either the exact or the modulo difference between the alpha value of its corresponding run
and that of the immediately previous run. For the first run, the previous alpha value is taken
to be −1. Reconstruction of the alpha value of a run—denoted simply by alpha—from that
of the immediately previous run—denoted by previousAlpha—and the value of the run’s
alpha_difference syntax element is done according to the formula
alpha = previousAlpha + alpha_difference
when the value of the syntax function isModuloAlphaDifference() is false or else the formula
alpha = (previousAlpha + alpha_difference) & mask
when the value of isModuloAlphaDifference() is true, where mask is either the 8-bit value 0xFF
(when the value of the alpha_channel_type syntax element is 1) or the 16-bit value 0xFFFF
(when alpha_channel_type is 2). A two’s complement representation shall be used for the value
of the sum so that the bitwise-and corresponds to a modulo operation. Note: If it permits a
simpler or more efficient implementation, the bitwise-and (modulo) operation may also be done
when isModuloAlphaDifference() is false, as it has no effect in that case.
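The two reconstruction formulas can be written as one small function. This is a sketch for illustration; the function and parameter names are hypothetical. Python integers behave like two's complement values under bitwise-and, so the mask acts as the modulo operation the text requires.

```python
def reconstruct_alpha(previous_alpha, alpha_difference, is_modulo,
                      alpha_channel_type):
    """Rebuild a run's alpha value from the previous run's value.

    is_modulo stands in for isModuloAlphaDifference(); alpha_channel_type
    is 1 for 8-bit alpha and 2 for 16-bit alpha.
    """
    mask = 0xFF if alpha_channel_type == 1 else 0xFFFF
    total = previous_alpha + alpha_difference
    return (total & mask) if is_modulo else total


# First run of a slice: the previous alpha value is taken to be -1.
# A fully opaque 8-bit first run arrives as a modulo difference of 0.
first_alpha = reconstruct_alpha(-1, 0, True, 1)
```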
Variable-length coding of alpha differences uses one of two different codes depending on the
value of alpha_channel_type. When alpha_channel_type is 1 (8-bit alpha), the code in Table 13
is used.
Differences with absolute value between 1 and 8, inclusive, are assigned codewords consisting
of a ‘0’ bit followed first by the three-bit binary representation of one less than the absolute value
of the difference and then by a one-bit indicator of the sign of the difference (‘0’ for positive,
‘1’ for negative). Differences with absolute value greater than 8, or equal to 0, are assigned
codewords consisting of an escape bit of ‘1’ followed by an eight-bit fixed-length code (FLC)
containing the binary representation of the value of the difference mod 256. To determine
the value of an alpha_difference syntax element and that of the associated syntax function
isModuloAlphaDifference() when alpha_channel_type is 1, first observe whether the next bit
of the bitstream is a ‘0’ or a ‘1’. If ‘0’, isModuloAlphaDifference() is false and the subsequent
four bits of the bitstream encode the absolute value and sign of alpha_difference according to
Table 13 and the above description. If ‘1’, isModuloAlphaDifference() is true and the subsequent
eight bits of the bitstream directly provide the binary representation of alpha_difference.
Note: If it permits a simpler or more efficient implementation, when the first bit of the codeword
representing alpha_difference is ‘1’ the full nine-bit codeword may also be used as the binary
representation of alpha_difference; the effect of including the escape bit will be eliminated by
the bitwise-and (modulo) operation in the reconstruction calculation.
When alpha_channel_type is 2 (16-bit alpha), the code in Table 14 is used for alpha differences.
Differences with absolute value between 1 and 64, inclusive, are assigned codewords consisting
of a ‘0’ bit followed first by the six-bit binary representation of one less than the absolute
value of the difference and then by a one-bit indicator of the sign of the difference (‘0’ for
positive, ‘1’ for negative). Differences with absolute value greater than 64, or equal to 0, are
assigned codewords consisting of an escape bit of ‘1’ followed by a sixteen-bit fixed-length
code containing the binary representation of the value of the difference mod 65536. To
determine the value of an alpha_difference syntax element and that of the associated syntax
function isModuloAlphaDifference() when alpha_channel_type is 2, first observe whether the
next bit of the bitstream is a ‘0’ or a ‘1’. If ‘0’, isModuloAlphaDifference() is false and the
subsequent seven bits of the bitstream encode the absolute value and sign of alpha_difference
according to Table 14 and the above description. If ‘1’, isModuloAlphaDifference() is true and
the subsequent sixteen bits of the bitstream directly provide the binary representation of
alpha_difference. Note: If it permits a simpler or more efficient implementation, when the first bit
of the codeword representing alpha_difference is ‘1’ the full seventeen-bit codeword may also
be used as the binary representation of alpha_difference; the effect of including the escape bit
will be eliminated by the bitwise-and (modulo) operation in the reconstruction calculation.
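Both alpha difference codes share the same shape and differ only in field widths, so a single routine can cover them. The sketch below assumes a hypothetical MSB-first BitReader helper; decode_alpha_difference is an illustrative name, and the returned pair is (alpha_difference, isModuloAlphaDifference()).

```python
class BitReader:
    """Hypothetical helper: reads bits MSB-first from a bytes object."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current bit position

    def read(self, n):
        value = 0
        for _ in range(n):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def decode_alpha_difference(reader, alpha_channel_type):
    """Return (alpha_difference, is_modulo) for 8-bit (1) or 16-bit (2) alpha."""
    magnitude_bits, flc_bits = (3, 8) if alpha_channel_type == 1 else (6, 16)
    if reader.read(1) == 0:
        # Short codeword: magnitude minus 1, then a sign bit ('1' = negative).
        magnitude = reader.read(magnitude_bits) + 1
        negative = reader.read(1) == 1
        return (-magnitude if negative else magnitude), False
    # Escape bit '1': fixed-length code holding the difference modulo 2^n.
    return reader.read(flc_bits), True
```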
 v \ u    0   1   2   3   4   5   6   7
   0      0   1   4   5  16  17  21  22
   1      2   3   6   7  18  20  23  28
   2      8   9  12  13  19  24  27  29
   3     10  11  14  15  25  26  30  31
   4     32  33  37  38  45  46  53  54
   5     34  36  39  44  47  52  55  60
   6     35  40  43  48  51  56  59  61
   7     41  42  49  50  57  58  62  63
When, on the other hand, the value of interlace_mode is not 0 (that is, when the block is part
of a field picture), the interlaced scan pattern of Figure 5 is used.
 v \ u    0   1   2   3   4   5   6   7
   0      0   2   8  10  32  34  35  41
   1      1   3   9  11  33  36  40  42
   2      4   6  12  14  37  39  43  49
   3      5   7  13  15  38  44  48  50
   4     16  18  19  25  45  47  51  57
   5     17  20  24  26  46  52  56  58
   6     21  23  27  30  53  55  59  62
   7     22  28  29  31  54  60  61  63
Setting scan[][] to the relevant scan pattern, inverse block scanning is given by the formula
QF[v][u] = QFS[scan[v][u]]
(0 ≤ u, v < 8).
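The inverse scanning formula amounts to a table lookup per coefficient. A minimal sketch follows, using the progressive scan pattern given earlier in this section with rows indexed by v and columns by u; the names PROGRESSIVE_SCAN and inverse_scan are hypothetical.

```python
# Progressive scan pattern, indexed as scan[v][u].
PROGRESSIVE_SCAN = [
    [0, 1, 4, 5, 16, 17, 21, 22],
    [2, 3, 6, 7, 18, 20, 23, 28],
    [8, 9, 12, 13, 19, 24, 27, 29],
    [10, 11, 14, 15, 25, 26, 30, 31],
    [32, 33, 37, 38, 45, 46, 53, 54],
    [34, 36, 39, 44, 47, 52, 55, 60],
    [35, 40, 43, 48, 51, 56, 59, 61],
    [41, 42, 49, 50, 57, 58, 62, 63],
]


def inverse_scan(qfs, scan):
    """Apply QF[v][u] = QFS[scan[v][u]] to a 64-entry scanned array."""
    return [[qfs[scan[v][u]] for u in range(8)] for v in range(8)]


# A scanned array whose entry k holds the value k maps back onto the
# scan pattern itself, which makes a convenient self-check.
qf = inverse_scan(list(range(64)), PROGRESSIVE_SCAN)
```

The same function serves field pictures when the interlaced pattern is passed as the scan argument.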
quantization_index    qScale
        1                1
        2                2
       ...              ...
       126              126
       127              127
       128              128
       129              132
       130              136
       ...              ...
       223              508
       224              512
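The listed rows are consistent with a piecewise-linear mapping: qScale equals quantization_index up to 128 and grows in steps of 4 above that. The sketch below assumes the rows not listed follow the same pattern; q_scale is a hypothetical name.

```python
def q_scale(quantization_index):
    """Map quantization_index (1..224) to qScale.

    ASSUMPTION: rows not shown in the table follow the pattern implied by
    the listed rows (identity up to 128, then steps of 4).
    """
    if not 1 <= quantization_index <= 224:
        raise ValueError("quantization_index out of range 1..224")
    if quantization_index <= 128:
        return quantization_index
    return 128 + 4 * (quantization_index - 128)
```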
The dequantized DCT coefficients are calculated from the quantized ones, the quantization
weights, and the quantization scale factor using the formula
F[v][u] = (QF[v][u] * W[v][u] * qScale) ÷ 8
(0 ≤ u, v < 8). Because the results of this calculation are not always integral, they require a
fixed-point or floating-point representation. Notice that the quantized DCT coefficients, the
quantization weights, and the quantization scale factor are all integral, so the dequantized DCT
coefficients will always be multiples of 1/8; three fraction bits are sufficient to provide exact
representation. At least two fraction bits (i.e., quarter-integer precision) shall be retained for the
subsequent inverse transform.
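Because every dequantized coefficient is a multiple of 1/8, the calculation can be kept exact. The sketch below uses Python's Fraction type purely for illustration; the function name dequantize is hypothetical. In a fixed-point decoder, the integer product QF * W * qScale is itself the coefficient expressed with three fraction bits.

```python
from fractions import Fraction


def dequantize(qf, w, q_scale):
    """Compute F[v][u] = (QF[v][u] * W[v][u] * qScale) / 8 exactly.

    qf and w are 8x8 lists of integers indexed [v][u]; q_scale is an integer.
    """
    return [[Fraction(qf[v][u] * w[v][u] * q_scale, 8) for u in range(8)]
            for v in range(8)]


# Example: a single quantized coefficient of 3, weight 4, qScale 5.
qf = [[0] * 8 for _ in range(8)]
qf[0][0] = 3
w = [[4] * 8 for _ in range(8)]
f = dequantize(qf, w, 5)  # f[0][0] is 60/8, i.e. 7.5
```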
(0 ≤ x, y < 8), where C(0) = 1/√2 and C(n) = 1 for n = 1, …, 7. The IDCT follows the conventions
of IEEE Std 1180-1990, specifically the expectation that the DCT coefficients have the
range of 12-bit signed integers while the reconstructed values have the range of 9-bit signed
integers.
Either a fixed-point or a floating-point implementation of the IDCT is acceptable. The implementation
shall be capable of accommodating DCT coefficients with at least two bits of fractional
precision. To ensure that the IDCT calculation is sufficiently accurate, the implementation shall
comply with Annex A, “IDCT Implementation Accuracy Qualification.” All fraction bits of the
reconstructed values arising from the implementation should be preserved for conversion of the
reconstructed values to pixel component samples.
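For orientation, a direct floating-point IDCT can be sketched as follows. It assumes the standard separable 8×8 IDCT definition of IEEE Std 1180-1990 with the normalization C(0) = 1/√2 and an overall factor of 1/4; the name idct_8x8 and the [y][x] output indexing are choices made for this example. It is far too slow for production use and is meant only as a readable reference.

```python
import math


def idct_8x8(F):
    """Direct 8x8 IDCT of coefficients F[v][u]; returns samples out[y][x].

    ASSUMPTION: the standard IEEE Std 1180-1990 definition is used.
    No rounding is applied, so all fraction bits of the result are kept.
    """
    def c(n):
        return 1.0 / math.sqrt(2.0) if n == 0 else 1.0

    out = [[0.0] * 8 for _ in range(8)]
    for y in range(8):
        for x in range(8):
            total = 0.0
            for v in range(8):
                for u in range(8):
                    total += (c(u) * c(v) * F[v][u]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[y][x] = total / 4.0
    return out
```

A block containing only a DC coefficient of 8 reconstructs to a flat block of value 1, since (1/4) · (1/√2)² · 8 = 1.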
where alpha is a decoded alpha value, alphaSample is the corresponding pixel alpha component
sample, and b is the number of bits per pixel component sample.
0 1
2 3
For 4:2:2 sampling (when the value of the chroma_format syntax element is 2), the two Cb
blocks and two Cr blocks are each arranged by block index as shown in Figure 7.
Figure 7 – Order of chroma (Cb, Cr) blocks within macroblock, 4:2:2 sampling
For 4:4:4 sampling (when chroma_format is 3), the four Cb blocks and four Cr blocks are each
arranged by block index as shown in Figure 8. Note: The arrangement of the 4:4:4 chroma
blocks (Figure 8) differs from that of the luma blocks (Figure 6).
0 2
1 3
Figure 8 – Order of chroma (Cb, Cr) blocks within macroblock, 4:4:4 sampling
Within a slice, alpha component data are organized not as macroblocks and blocks but rather
as an array of raster-scanned values. Calling the array alphaValues[], for slice(i, j) the decoded
alpha value corresponding to pixel n of row r is alphaValues[16 * slice_size_in_mb[j] * r + n].
The array includes alpha values—which shall be discarded—for the excess pixel(s) at the end
of each row of slices with j = number_of_slices_per_mb_row – 1 when 16 * width_in_mb >
horizontal_size but does not include alpha values for the excess row(s) of pixels at the bottom of
slices with i = height_in_mb – 1 when 16 * height_in_mb > picture_vertical_size.
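The indexing rule can be illustrated with a small helper that extracts one row of a slice and drops any excess pixels at the end of the row. This is a sketch only: alpha_row and visible_pixels are hypothetical names, and the caller is assumed to know how many pixels of the slice's row fall inside the picture.

```python
def alpha_row(alpha_values, slice_size_in_mb_j, r, visible_pixels):
    """Return the alpha values of row r of a slice.

    alpha_values is the raster-scanned array for the slice; each row holds
    16 * slice_size_in_mb_j values. visible_pixels (hypothetical parameter)
    is the number of values in the row that lie inside the picture; any
    values beyond it are the excess pixels that are discarded.
    """
    stride = 16 * slice_size_in_mb_j
    start = stride * r
    return alpha_values[start:start + visible_pixels]


# Example: a slice two macroblocks wide (32 values per row) in which only
# 30 pixels per row are inside the picture.
values = list(range(96))
row1 = alpha_row(values, 2, 1, 30)
```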
ProRes decoders are not required to use any particular implementation of the inverse discrete
cosine transform (IDCT) calculation. To ensure the quality of decoded image data, however,
only implementations that are sufficiently accurate are acceptable. This annex specifies how to
qualify an IDCT implementation for use in a ProRes decoder.
The qualification procedure is based on the one in Section 3 of IEEE Std 1180-1990. As in
Section 3.2 of that standard, there are seven steps involved in measuring the accuracy of a
proposed IDCT implementation. They are:
(1) Generate random integer data values in the range −L to +H using the random number
generator in the appendix of IEEE Std 1180-1990. Arrange the integer data values into
8×8 blocks of “reference-precision” (i.e., at least 64-bit) floating-point pixels by converting
them to reference-precision floating-point numbers, dividing by 8, and populating the blocks
in row-major order. Data sets of 10,000 blocks each shall be generated for (L = 2048,
H = 2047), (L = H = 40), and (L = H = 2400).
(2) For each 8×8 block of reference-precision floating-point pixels produced by step 1, perform
a separable, orthogonal FDCT (as defined in Eq 1 of IEEE Std 1180-1990) using reference-
precision floating-point arithmetic.
(3) For each 8×8 block of transformed coefficients produced by step 2, round the 64 coefficients
to the nearest quarter-integer by multiplying by 4, rounding to the nearest integer,
and then dividing by 4. Clip the rounded coefficients to the range −2048 to +2047.75.
These blocks, which consist of values that can be represented using a signed binary fixed-
point format with 12 or more integer bits and 2 or more fraction bits, are the input data to
the inverse transforms.
(4) For each 8×8 block of data produced by step 3, perform a separable, orthogonal IDCT
using reference-precision floating-point arithmetic. Retaining full reference precision—
specifically, without rounding to integers or any other lesser precision—clip the resulting
values to the range −256 to +256. These blocks of 8×8 reference-precision floating-point
pixels are the “reference” IDCT output data.
(5) For each 8×8 block of data produced by step 3, perform an IDCT using the proposed
implementation (or a bit-accurate equivalent). Retaining the full precision of the results,
promote them as necessary to reference-precision floating-point numbers and clip those
values to the range −256 to +256. These blocks of 8×8 reference-precision floating-point
pixels are the “test” IDCT output data.
(6) For each of the 64 IDCT output pixels and for each of the 10,000 block data sets generated
by steps 1–5, measure the peak, mean, and mean square errors between the “reference”
data and the “test” data. The error calculation and accumulation shall be carried out with
reference-precision floating-point arithmetic.
(7) Rerun the measurements using exactly the same integer data values of step 1, but change
the sign on each one.
Note: The distinctions between the procedure described above and the one in Section 3.2 of
IEEE Std 1180-1990 are that here the pixel data produced by step 1 have three fraction bits
(rather than being integers), the transformed coefficients produced by step 2 are rounded to
the nearest quarter-integer (rather than integer) and clipped to the range −2048 to +2047.75
(rather than −2048 to +2047), the IDCT output data retain the full precision of their respective
calculations (rather than being rounded to integers) and are clipped to the range −256 to +256
(rather than −256 to +255), and the error measurements use reference-precision floating-point
(rather than integer) arithmetic.
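The rounding and clipping operations of steps 3 to 5 can be sketched as two small helpers. The names are hypothetical, and the handling of exact ties (rounding half up) is an assumption, since the procedure only says to round to the nearest integer.

```python
import math


def prepare_coefficient(value):
    """Step 3: round to the nearest quarter-integer, then clip.

    ASSUMPTION: exact ties are rounded up (floor(x + 0.5)).
    """
    rounded = math.floor(value * 4.0 + 0.5) / 4.0
    return min(max(rounded, -2048.0), 2047.75)


def clip_output(value):
    """Steps 4 and 5: clip an IDCT output value to [-256, +256], unrounded."""
    return min(max(value, -256.0), 256.0)
```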
Using the error term definitions in Section 3.3 of IEEE Std 1180-1990, the (reference-precision
floating-point) errors measured according to the above procedure shall meet the following
criteria:
(1) For any pixel location, the peak error (ppe) shall not exceed 0.15 in magnitude.
(2) For any pixel location, the mean square error (pmse) shall not exceed 0.002.
(3) Overall, the mean square error (omse) shall not exceed 0.001.
(4) For any pixel location, the mean error (pme) shall not exceed 0.0015 in magnitude.
(5) Overall, the mean error (ome) shall not exceed 0.00015 in magnitude.
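A check of the five criteria for one data set might look like the sketch below. It assumes that the error at a pixel location is the test value minus the reference value and that the per-location statistics are taken over all blocks of the data set, following the usual reading of Section 3.3 of IEEE Std 1180-1990; the names idct_error_report, LIMITS, and passes are hypothetical.

```python
LIMITS = {"ppe": 0.15, "pmse": 0.002, "omse": 0.001,
          "pme": 0.0015, "ome": 0.00015}


def idct_error_report(reference_blocks, test_blocks):
    """Return worst-case and overall error statistics for one data set.

    Both arguments are equal-length lists of 8x8 blocks (lists of lists of
    floats). Error is ASSUMED to be test minus reference.
    """
    n = len(reference_blocks)
    worst_ppe = worst_pmse = worst_pme = 0.0
    sum_sq = sum_err = 0.0
    for y in range(8):
        for x in range(8):
            errs = [t[y][x] - r[y][x]
                    for r, t in zip(reference_blocks, test_blocks)]
            sq = sum(e * e for e in errs)
            tot = sum(errs)
            worst_ppe = max(worst_ppe, max(abs(e) for e in errs))
            worst_pmse = max(worst_pmse, sq / n)
            worst_pme = max(worst_pme, abs(tot / n))
            sum_sq += sq
            sum_err += tot
    return {"ppe": worst_ppe, "pmse": worst_pmse, "pme": worst_pme,
            "omse": sum_sq / (64 * n), "ome": abs(sum_err / (64 * n))}


def passes(report):
    """True when every statistic is within its limit."""
    return all(report[key] <= LIMITS[key] for key in LIMITS)
```

The check would be run once per data set, including the sign-inverted reruns of step 7.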