Definition of Levels for MPEG-4 Video Profiles

Posted by Stanley on November 20th, 2007

Table A.1 describes the MPEG-4 Visual levels for the Version 1 and Version 2 profiles that include only natural visual (video) data, that is, the so-called MPEG-4 video profiles. Note that Level 0 of the Simple profile was defined in the 2nd Extension to the 2nd Edition of the MPEG-4 Visual standard.

Table A.1 Levels for the MPEG-4 video profiles

Visual profile	Level	Typical visual session size	Max. number of objects¹	Max. number of objects per type	Max. unique quant. tables	Max. VMV buffer size (MB units)²	Max. VCV buffer size (MB)⁸	VCV decoder rate (MB/s)⁴	VCV boundary MB decoder rate (MB/s)⁹	Max. total VBV buffer size (units of 16384 bits)⁵	Max. VOL VBV buffer size (units of 16384 bits)	Max. video packet length (bits)⁶	Max. sprite size (MB units)	Wavelet restrictions	Max. bitrate (kbit/s)	Max. enhancement layers per object
Simple¹⁰	L0	QCIF	1	1 x Simple	1	198	99	1485	N.A.	10	10	2048	N. A.	N. A.	64	N. A.
Simple	L1	QCIF	4	4 x Simple	1	198	99	1485	N.A.	10	10	2048	N. A.	N. A.	64	N. A.
Simple	L2	CIF	4	4 x Simple	1	792	396	5940	N. A.	40	40	4096	N. A.	N. A.	128	N. A.
Simple	L3	CIF	4	4 x Simple	1	792	396	11880	N. A.	40	40	8192	N. A.	N. A.	384	N. A.
Advanced Real Time Simple	L1	QCIF	4	4 x Simple or Adv. Real Time Simple	1	198	99	1485	N.A.	10	10	8192	N. A.	N. A.	64	N. A.
Advanced Real Time Simple	L2	CIF	4	4 x Simple or Adv. Real Time Simple	1	792	396	5940	N. A.	40	40	16384	N. A.	N. A.	128	N. A.
Advanced Real Time Simple	L3	CIF	4	4 x Simple or Adv. Real Time Simple	1	792	396	11880	N. A.	40	40	16384	N. A.	N. A.	384	N. A.
Advanced Real Time Simple	L4	CIF	16	16 x Simple or Adv. Real Time Simple	1	792	396	11880	N. A.	80	80	16384	N. A.	N. A.	2000	N. A.
Simple Scalable	L1	CIF	4	4 x Simple or Simple Scalable	1	1782	495	7425	N. A.	40	40	2048	N. A.	N. A.	128	1 spatial or temporal enhancement layer
Simple Scalable³	L2	CIF	4	4 x Simple or Simple Scalable	1	3168	792	23760	N.A.	40	40	4096	N. A.	N. A.	256	1 spatial or temporal enhancement layer
Core	L1	QCIF	4	4 x Core or Simple	4	594	198	5940	2970	16	16	4096	N. A.	N. A.	384	1
Core	L2	CIF	16	16 x Core or Simple	4	2376	792	23760	11880	80	80	8192	N. A.	N. A.	2000	1
Advanced Core	L1	QCIF	4	4 x Core or Simple or Adv. Scalable Texture	4	594	198	5940	2970	16	8	4096	N. A.	see Table A.5	384	1
Advanced Core	L2	CIF	16	16 x Core or Simple or Adv. scalable Texture	4	2376	792	23760	11880	80	40	8192	N. A.	See Table A.5	2000	1
Core Scalable	L1	CIF	4	4 x Core or Simple or Core scalable or Simple Scalable	4	2376	792	14850	7425	64	64	4096	N.A.	N.A.	768	1
Core Scalable	L2	CIF	8	8 x Core or Simple or Core scalable or Simple Scalable	4	2970	990	29700	14850	80	80	4096	N.A.	N.A.	1500	1
Core Scalable	L3	CCIR 601	16	16 x Core or Simple or Core scalable or Simple Scalable	4	12096	4032	120960	60480	80	80	16384	N. A.	N. A.	4000	2
Main	L2	CIF	16	16 x Main or Core or Simple	4	3960	1188	23760	11880	80	80	8192	1584	Scalable Texture Profile@L1	2000	1
Main	L3	CCIR 601	32	32 x Main or Core or Simple	4	11304	3240	97200	48600	320	320	16384	6480	Scalable Texture Profile@L1	15000	1
Main	L4	1920 x 1088	32	32 x Main or Core or Simple	4	65344	16320	489600	244800	760	760	16384	65280	Scalable Texture Profile@L2	38400	1
Advanced Coding Efficiency	L1	CIF	4	4 x Adv. Coding Efficiency or Core or Simple	4	1188	792	11880	5940	40	40	8192	N. A.	N. A.	384	1
Advanced Coding Efficiency	L2	CIF	16	16 x Adv. Coding Efficiency or Core or Simple	4	2376	1188	23760	11880	80	80	8192	N. A.	N. A.	2000	1
Advanced Coding Efficiency	L3	CCIR 601	32	32 x Adv. Coding Efficiency or Core or Simple	4	9720	3240	97200	48600	320	320	16384	N. A.	N. A.	15000	1
Advanced Coding Efficiency	L4	1920 x 1088	32	32 x Adv. Coding Efficiency or Core or Simple	4	48960	16320	489600	244800	760	760	16384	N. A.	N. A.	38400	1
N-Bit	L2	CIF	16	16 x Core or Simple or N-Bit	4	2376	792	23760	11880	80	80	8192	N. A.⁷	N. A.	2000	1

Notes:

Enhancement layers are not counted as separate objects.

The maximum VMV (Video Memory Verifier) buffer size is the bound on the memory (in macroblock units) that can be used by the VMV algorithm. This algorithm (see [MPEG4-2; subclause D.5]) models the pixel memory needed by the entire visual decoding process. This includes the memory needed for reference VOPs in the prediction of P-, B-, and S(GMC)-VOPs, the storage of the reconstructed VOPs until they are released by the decoder, and the memory required to queue B-VOPs until composition occurs. For the profiles that contain more than one layer, the memory requirements include all base and enhancement layers. When macroblocks belong to different, overlapping objects, some of them may overlap on the display; however, separate memory is still required for each of them in the VMV (prior to composition).

The conformance point for the base layer of the Simple Scalable Visual profile is the Simple Profile@L1 when Simple Scalable Profile@L1 is used and the Simple Profile@L2 when Simple Scalable Profile@L2 is used.

The VCV (Video Complexity Verifier) decoder rate is the vcv_decoder_rate (H) referred to in [MPEG4-2; subclause D.4]. This parameter is the number of macroblocks per second, based on typical spatial and temporal resolutions, as follows:

1485 MB/s corresponds to QCIF at 15 Hz
5940 MB/s corresponds to CIF at 15 Hz and also twice QCIF at 30 Hz
11880 MB/s corresponds to CIF at 30 Hz
7425 MB/s corresponds to 1.25 times CIF at 15 Hz
23760 MB/s corresponds to twice CIF at 30 Hz
97200 MB/s corresponds to twice ITU-R 601 at 30 Hz
489600 MB/s corresponds to twice 1920×1088 at 30 Hz
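
The correspondences above follow from counting 16×16 macroblocks per frame and multiplying by the frame rate. The sketch below reproduces the listed values. The frame dimensions (176×144 for QCIF, 352×288 for CIF, 720×576 for ITU-R 601) and the helper names are assumptions made for illustration; they are not taken from the text of the standard.

```python
import math

def macroblocks_per_frame(width, height):
    """Count the 16x16 macroblocks needed to cover a frame (rounding up)."""
    return math.ceil(width / 16) * math.ceil(height / 16)

def vcv_decoder_rate(width, height, fps, factor=1.0):
    """Macroblocks per second for a frame size, frame rate and multiplier."""
    return round(factor * macroblocks_per_frame(width, height) * fps)

# Assumed frame sizes (not stated in the list above).
QCIF = (176, 144)
CIF = (352, 288)
ITU_R_601 = (720, 576)
HD = (1920, 1088)

for label, size, fps, factor in [
    ("QCIF at 15 Hz", QCIF, 15, 1.0),
    ("CIF at 15 Hz", CIF, 15, 1.0),
    ("CIF at 30 Hz", CIF, 30, 1.0),
    ("1.25 x CIF at 15 Hz", CIF, 15, 1.25),
    ("2 x CIF at 30 Hz", CIF, 30, 2.0),
    ("2 x ITU-R 601 at 30 Hz", ITU_R_601, 30, 2.0),
    ("2 x 1920x1088 at 30 Hz", HD, 30, 2.0),
]:
    print(label, vcv_decoder_rate(*size, fps, factor))
```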

The total (aggregated) vbv_buffer_size is the sum of the individual VBV buffer occupancies at any given time (in units of 16384 bits) for all VOLs of all VOs. This total VBV size is limited according to the profile and level.
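
Because the VBV limits are expressed in units of 16384 bits, it can help to convert them to bytes and to check an aggregate occupancy against a level's limit. The following is a minimal sketch; the function names and the idea of passing per-VOL occupancies in bits are assumptions for illustration.

```python
VBV_UNIT_BITS = 16384  # one VBV unit, as used in the table headers

def vbv_units_to_bytes(units):
    """Convert a VBV size given in units of 16384 bits to bytes."""
    return units * VBV_UNIT_BITS // 8

def total_vbv_within_limit(vol_occupancies_bits, max_total_units):
    """Check that the summed occupancy of all VOLs stays within the limit."""
    return sum(vol_occupancies_bits) <= max_total_units * VBV_UNIT_BITS

# Simple Profile@L1 allows a total VBV size of 10 units.
print(vbv_units_to_bytes(10))
print(total_vbv_within_limit([100000, 60000], 10))
```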

The maximum video packet length is defined as the maximum number of bits of data_partitioned_motion_shape_texture() in one video packet. The constraint applies only when the data-partitioning tool is enabled in the bitstream. When data partitioning is disabled, there is no limit on the video packet length.
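
The conditional nature of this limit can be sketched as a small check. The function name and arguments are hypothetical; only the rule itself comes from the note above.

```python
def video_packet_length_ok(packet_bits, max_packet_bits, data_partitioned):
    """Apply the packet length limit only when data partitioning is enabled."""
    if not data_partitioned:
        # No limit applies when the data-partitioning tool is disabled.
        return True
    return packet_bits <= max_packet_bits

# Simple Profile@L1 lists a maximum video packet length of 2048 bits.
print(video_packet_length_ok(3000, 2048, data_partitioned=True))
print(video_packet_length_ok(3000, 2048, data_partitioned=False))
```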

N. A. means Not Applicable.

The maximum VCV buffer size (cumulative over all layers of all VOs) is twice the maximum number of macroblocks per VOP in the profile and level combination except for the Simple Visual Profile, Simple Scalable profile (Level 1) and Advanced Real Time Simple Profile. For the Simple Visual Profile and the Advanced Real Time Simple Profile, this value is the maximum number of macroblocks per VOP. For the Simple Scalable profile (Level 1), it is 1.25 times the maximum number of macroblocks per VOP. The limit applies to both the VCV buffer and the boundary MB VCV buffer.
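
The multiplier rule in this note can be written as a short function. This is a sketch under stated assumptions: the function name is hypothetical, and mbs_per_vop (the maximum number of macroblocks per VOP for the profile and level) is an input that Table A.1 does not list directly.

```python
def max_vcv_buffer_size(profile, level, mbs_per_vop):
    """Maximum VCV buffer size in macroblocks, following the note's rule."""
    if profile in ("Simple", "Advanced Real Time Simple"):
        factor = 1.0
    elif profile == "Simple Scalable" and level == 1:
        factor = 1.25
    else:
        factor = 2.0
    return round(factor * mbs_per_vop)

# 99 macroblocks per QCIF VOP and 396 per CIF VOP are assumed inputs.
print(max_vcv_buffer_size("Simple", 1, 99))
print(max_vcv_buffer_size("Simple Scalable", 1, 396))
print(max_vcv_buffer_size("Core", 2, 396))
```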

The VCV boundary MB decoder rate column bounds the number of macroblocks containing nontrivial shape information (boundary macroblocks, which are neither transparent nor opaque). The VCV boundary MB decoder rate constrains the total number of boundary MBs in all VOLs concurrently. Note that the boundary macroblocks are added to both the VCV and boundary MB VCV buffers.
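
In Table A.1, each listed boundary MB decoder rate is half of the VCV decoder rate in the same row. This is an observation drawn from the table values, not a rule stated in the notes. The sketch below checks it for a few rows copied from the table; the dictionary and function names are hypothetical.

```python
# (profile, level): (VCV decoder rate, VCV boundary MB decoder rate), in MB/s,
# copied from a few rows of Table A.1.
RATES = {
    ("Core", "L1"): (5940, 2970),
    ("Core", "L2"): (23760, 11880),
    ("Main", "L3"): (97200, 48600),
    ("Main", "L4"): (489600, 244800),
}

def boundary_rate_is_half(vcv_rate, boundary_rate):
    """True when the boundary MB rate is exactly half of the VCV rate."""
    return 2 * boundary_rate == vcv_rate

all_half = all(boundary_rate_is_half(v, b) for v, b in RATES.values())
print(all_half)
```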

For the Simple Profile@Level 0, the following restrictions apply:

The maximum frame rate shall be 15 frames per second;
The maximum f_code shall be 1;
The intra_dc_vlc_threshold shall be 0;
The maximum horizontal luminance pixel resolution shall be 176 pels/line;
The maximum vertical luminance pixel resolution shall be 144 pels/VOP;
If AC prediction is used, the QP value shall not be changed within a VOP (or within a video packet if video packets are used in a VOP). If AC prediction is not used, there are no restrictions on changing the QP value.
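
The restrictions listed above can be collected into a simple checker. This is only a sketch: the VideoParams container and its field names are hypothetical, and a real conformance check would have to read these values from the bitstream.

```python
from dataclasses import dataclass

@dataclass
class VideoParams:
    """Hypothetical container for the stream properties being checked."""
    frame_rate: float
    f_code: int
    intra_dc_vlc_threshold: int
    width: int            # luminance pels per line
    height: int           # luminance pels per VOP
    ac_prediction: bool
    qp_changes_within_vop: bool

def simple_l0_violations(p):
    """Return the Simple Profile@Level 0 restrictions that p violates."""
    violations = []
    if p.frame_rate > 15:
        violations.append("frame rate above 15 frames per second")
    if p.f_code > 1:
        violations.append("f_code above 1")
    if p.intra_dc_vlc_threshold != 0:
        violations.append("intra_dc_vlc_threshold is not 0")
    if p.width > 176:
        violations.append("horizontal luminance resolution above 176")
    if p.height > 144:
        violations.append("vertical luminance resolution above 144")
    if p.ac_prediction and p.qp_changes_within_vop:
        violations.append("QP changes within a VOP while AC prediction is used")
    return violations

qcif_stream = VideoParams(15, 1, 0, 176, 144, True, False)
print(simple_l0_violations(qcif_stream))
```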

Table A.2 describes the MPEG-4 Visual levels for the Studio profiles defined in the 1st Extension to the 2nd Edition of the MPEG-4 Visual standard [MPEG01a].

Table A.2 Levels for the Studio profiles

Visual profile	Level	Typical visual session formats¹	Max. pixel depth	Max. number of objects	Max. number per type	Max. VMV buffer size (samples)²	Max. VCV buffer size (samples)³	VCV decoder rate (samples/s)	VCV boundary MB decoder rate (samples/s)	Max. total VBV buffer size	Max. VOL VBV buffer size	Max. video packet length (bits)	Max. sprite size (samples)⁴	Wavelet restrictions	Max. bitrate (Mbit/s)	Max. enhancement layers per object
Simple Studio	L1	ITU-R601:4224 ITU-R601:444	10	1	1 x Simple Studio	1313280	1313280	33177600	33177600	576	576	N.A.	N.A.	N.A.	180	N.A.
Simple Studio	L2	ITU-R709.60I:422 ITU-R601:444444	10	1	1 x Simple Studio	4194304	4194304	125829120	125829120	1920	1920	N.A.	N.A.	N.A.	600	N.A.
Simple Studio	L3	ITU-R709. 60I:444 ITU-R709. 60I:4224	12	1	1 x Simple Studio	6291456	6291456	188743680	188743680	2880	2880	N.A.	N.A.	N.A.	900	N.A.
Simple Studio	L4	ITU-R709. 60P:444 ITU-R709. 60I:444444 2Kx2Kx30P:444	12	1	1 x Simple Studio	12582912	12582912	377487360	377487360	4320	4320	N.A.	N.A.	N.A.	1800	N.A.
Core Studio	L1	ITU-R601:4224 ITU-R601:444	10	4	4 x Core Studio or Simple Studio	5253120	2626560	66355200	66355200	576	576	N.A.	8294400	N.A.	90	N.A.
Core Studio	L2	ITU-R709.60I:422 ITU-R601:444444	10	4	4 x Core Studio or Simple Studio	16777216	8388608	251658240	251658240	1920	1920	N.A.	50135040	N.A.	300	N.A.
Core Studio	L3	ITU-R709. 60I:444 ITU-R709. 60I:4224	10	8	8 x Core Studio or Simple Studio	25165824	12582912	377487360	377487360	2880	2880	N.A.	75202560	N.A.	450	N.A.
Core Studio	L4	ITU-R709. 60P:444 ITU-R709. 60I:444444 2Kx2Kx30P:444	10	16	16 x Core Studio or Simple Studio	50331648	25165824	754974720	754974720	4320	4320	N.A.	150994944	N.A.	900	N.A.

Notes:

ITU-R 709 is ITU-R BT.709 and ITU-R 601 is ITU-R BT.601; 444444 means 444 (RGB) + 3 auxiliary channels; 4224 means 422 (YUV) + 1 auxiliary channel.

VMV is defined by the number of samples that belong to the bounding box of the texture, regardless of shape information. VMV also includes auxiliary channel samples.

VCV is defined by the number of samples that belong to the bounding box of the texture, regardless of shape information. VCV also includes auxiliary channel samples.

Maximum sprite size is defined by the number of samples for sprite memory.

Table A.3 describes the MPEG-4 Visual levels for the Advanced Simple and Fine Granularity Scalable profiles defined in the 2nd Extension to the 2nd Edition of the MPEG-4 Visual standard [MPEG01b].

Table A.3 Levels for the Advanced Simple and Fine Granularity Scalable (FGS) profiles

Visual profile	Level	Typical visual session size	Max. number of objects	Max. number per type	Max. unique quant. tables	Max. VMV buffer size (MB units)	Max. VCV buffer size (MB)	VCV decoder rate (MB/s)	Max. percentage of intra MBs with AC prediction in VCV buffer	Max total VBV buffer size (units of 16384 bits)	Max. VOL VBV buffer size (units of 16384 bits)	Max. video packet length (bits)	Maximum bitrate (kbit/s) ²	Maximum number of coded VOP-bps ³
Adv.Sim.	L0	176×144	1	1x AS or Simple	1	297	99	2970	100	10	10	2048	128	N.A.
Adv.Sim.	L1	176×144	4	4x AS or Simple	1	297	99	2970	100	10	10	2048	128	N.A.
Adv.Sim.	L2	352×288	4	4x AS or Simple	1	1188	396	5940	100	40	40	4096	384	N.A.
Adv.Sim.	L3	352×288	4	4x AS or Simple	1	1188	396	11880	100	40	40	4096	768	N.A.
Adv.Sim.	L4	352×576	4	4x AS or Simple	1	2376	792	23760	50	80	80	8192	3000	N.A.
Adv.Sim.	L5	720×576	4	4x AS or Simple	1	4860	1620	48600	25	112	112	16384	8000	N.A.
FGS	L0	176×144	1	1x AS or FGS or Simple	1	297	99	2970	100	10	10	2048	128	4
FGS	L1	176×144	4	4x AS or FGS or Simple	1	297	99	2970	100	10	10	2048	128	4
FGS	L2	352×288	4	4x AS or FGS or Simple	1	1188	396	5940	100	40	40	4096	384	4
FGS	L3	352×288	4	4x AS or FGS or Simple	1	1188	396	11880	100	40	40	4096	768	4
FGS	L4	352×576	4	4x AS or FGS or Simple	1	2376	792	23760	50	80	80	8192	3000	4
FGS	L5	720×576	4	4x AS or FGS or Simple	1	4860	1620	48600	25	112	112	16384	8000	4

Notes:

The following restriction applies to Level 0 of Advanced Simple profile and FGS profile: if AC prediction is used, the QP value shall not be changed within a VOP (or within a video packet if video packets are used in a VOP). If AC prediction is not used, there are no restrictions to changing the QP value.

For the FGS profile, this column is the maximum base-layer bitrate.

The maximum number of coded VOP-bps takes into consideration the shifted bits after applying frequency weighting and/or selective enhancement.

The number of FGS, FGST, or FGS-FGST layers is always one. If the FGS layer and the FGST layer are separated, the number of total enhancement layers is two.

The interlace tools are not used for levels L0, L1, L2, and L3 of the Advanced Simple and FGS profiles.

It is inherent in the FGS profile that the base and enhancement layers are tightly coupled to each other. To avoid unnecessary memory storage, the following constraints apply to the decoding time relationship of the enhancement layer and the base layer:

Decoding and composition (or presentation in a no-compositor decoder) of each FGS or FGST VOP shall be performed in the same time unit.
Decoding of each FGS and FGST VOP shall be performed immediately after the reference base layer VOP(s) are decoded without violating the above constraint.
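
The Advanced Simple rows of Table A.3 can also be used as a lookup table, for example to find the lowest level that accommodates a given stream. The sketch below is a simplification: it only considers the number of objects, the VCV decoder rate, and the maximum bitrate, and ignores every other column and note. The names are hypothetical.

```python
# (level, max objects, VCV decoder rate in MB/s, max bitrate in kbit/s),
# copied from the Advanced Simple rows of Table A.3, lowest level first.
ADVANCED_SIMPLE_LEVELS = [
    ("L0", 1, 2970, 128),
    ("L1", 4, 2970, 128),
    ("L2", 4, 5940, 384),
    ("L3", 4, 11880, 768),
    ("L4", 4, 23760, 3000),
    ("L5", 4, 48600, 8000),
]

def lowest_advanced_simple_level(num_objects, mb_per_second, bitrate_kbps):
    """Return the first level whose three limits cover the stream, or None."""
    for level, max_objects, max_rate, max_bitrate in ADVANCED_SIMPLE_LEVELS:
        if (num_objects <= max_objects
                and mb_per_second <= max_rate
                and bitrate_kbps <= max_bitrate):
            return level
    return None

# One object, CIF at 30 Hz (11880 MB/s), 500 kbit/s.
print(lowest_advanced_simple_level(1, 11880, 500))
```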
