1 Scope
This standard specifies the technical scheme for coding and decoding in multichannel digital audio compression, including the bit stream format (syntactic structure and semantics), the decoding process and the technical requirements for each decoding module; informative guidance and implementation methods are provided for the coding part of this technology.
This standard applies to the storage or transmission of high-quality multichannel digital audio on storage media of limited capacity and over channels of limited bandwidth, such as digital audio broadcasting, digital TV (including transmission modes such as satellite, terrestrial and cable), home audio, digital cinema, DVD, network streaming media and personal media players.
2 Normative References
The following standards contain provisions which, through reference in this standard, constitute provisions of this standard. For dated references, subsequent amendments (excluding corrigenda) to, or revisions of, any of these publications do not apply. However, parties to agreements based on this standard are encouraged to investigate whether the latest editions of these documents can be applied. For undated references, the latest edition of the normative document applies.
GB/T 17975.1-2000 Information Technology - Generic Coding of Moving Pictures and Associated Audio Information - Part 1: Systems (idt ISO/IEC 13818-1: 1996)
GB/T 4880.2-2000 Codes for the Representation of Names of Languages - Part 2: Alpha-3 Code (eqv ISO 639-2: 1998)
ISO/IEC 8859-1: 1998 Information Technology - 8-bit Single-byte Coded Graphic Character Sets - Part 1: Latin Alphabet No. 1
3 Terms, Definitions and Abbreviations
For the purpose of this standard, the following terms, definitions and abbreviations apply.
3.1 Terms and Definitions
3.1.1
Audio data
Bit sequence (data) used to represent the original audio signal after coding.
3.1.2
Audio sample
PCM (Pulse Code Modulation) sample value at the encoder input or the decoder output.
3.1.3
Auxiliary data
Data that do not belong to the audio signal itself but are related to it, such as time code.
3.1.4
Bit stream
Bit sequence representing the original audio signal, generated by the encoder in accordance with this standard.
3.1.5
Brief window function
Window function with a total length of 256 samples, of which only 160 samples are used for the MDCT (Modified Discrete Cosine Transform).
3.1.6
Critical band
The way the human ear resolves sound can be modelled approximately by a subband filter bank whose subband bandwidths increase approximately exponentially with frequency. Each subband of this filter bank is a critical band.
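As a rough illustration of how critical bandwidth widens with frequency, the sketch below evaluates the Zwicker-Terhardt analytic approximation. That formula is an assumption brought in for illustration only and is not taken from this standard; the critical-band boundaries used for decoding are those tabulated in Appendix B. The function name is hypothetical.

```python
def critical_bandwidth_hz(freq_hz: float) -> float:
    """Approximate critical bandwidth (Hz) around a centre frequency.

    Uses the Zwicker-Terhardt approximation, which is an assumption for
    illustration and is not part of this standard.
    """
    if freq_hz < 0:
        raise ValueError("frequency must be non-negative")
    f_khz = freq_hz / 1000.0
    return 25.0 + 75.0 * (1.0 + 1.4 * f_khz ** 2) ** 0.69


if __name__ == "__main__":
    # Print the approximate bandwidth at a few centre frequencies.
    for f in (100, 500, 1000, 4000, 10000):
        print(f"{f:>6} Hz -> {critical_bandwidth_hz(f):8.1f} Hz wide")
```

Under this approximation the bandwidth stays near 100 Hz at low frequencies and grows steadily above roughly 500 Hz, which is the behaviour the definition describes.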
3.1.7
Downmix
Matrix operation applied to N channels to obtain fewer than N channels (see Appendix D).
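A downmix of this kind can be sketched as a matrix applied to each multichannel sample frame. The example below folds a six-channel layout down to two channels. The channel order (L, R, C, LFE, Ls, Rs), the 0.7071 gain and the decision to drop the LFE channel are all assumptions for illustration; they are not the coefficients given in Appendix D.

```python
from typing import List, Sequence

# Assumed input channel order: L, R, C, LFE, Ls, Rs (hypothetical layout).
G = 0.7071  # assumed -3 dB gain for the centre and surround channels

DOWNMIX_5_1_TO_STEREO: List[List[float]] = [
    # L    R    C  LFE  Ls   Rs
    [1.0, 0.0, G, 0.0, G, 0.0],  # left output
    [0.0, 1.0, G, 0.0, 0.0, G],  # right output
]


def downmix(matrix: Sequence[Sequence[float]],
            frame: Sequence[float]) -> List[float]:
    """Apply an M x N downmix matrix to one N-channel sample frame."""
    for row in matrix:
        if len(row) != len(frame):
            raise ValueError("matrix width must equal the channel count")
    return [sum(c * s for c, s in zip(row, frame)) for row in matrix]


if __name__ == "__main__":
    # One sample per channel: only the centre channel carries signal.
    print(downmix(DOWNMIX_5_1_TO_STEREO, [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]))
```

A real decoder would also guard against clipping after summation; that step is omitted here.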
3.1.8
Frame
Audio data representing one frame of the audio signal, generated by the encoder in accordance with this standard. It is the basic unit of the bit stream in this standard. One frame may cover 128, 256, 512 or 1024 audio samples.
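The time span covered by one frame follows directly from its sample count and the sample rate. The sketch below assumes the sample count is per channel; the function name is hypothetical, and 48 000 Hz is used only because it is one of the rates named in the Appendix B table titles.

```python
FRAME_SIZES = (128, 256, 512, 1024)  # samples per frame, per the definition


def frame_duration_ms(samples_per_frame: int, sample_rate_hz: int) -> float:
    """Return the duration of one frame in milliseconds.

    Assumes samples_per_frame counts samples of a single channel.
    """
    if samples_per_frame not in FRAME_SIZES:
        raise ValueError("frame size must be 128, 256, 512 or 1024")
    if sample_rate_hz <= 0:
        raise ValueError("sample rate must be positive")
    return 1000.0 * samples_per_frame / sample_rate_hz


if __name__ == "__main__":
    for n in FRAME_SIZES:
        print(f"{n:>4} samples at 48000 Hz: {frame_duration_ms(n, 48000):.2f} ms")
```

At 48 000 Hz, for instance, a 1024-sample frame spans about 21.3 ms.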
3.1.9
Frame header
Contents of GB/T 22726-2008
Foreword
Introduction
1 Scope
2 Normative References
3 Terms, Definitions and Abbreviations
3.1 Terms and Definitions
3.2 Abbreviations
4 General
4.1 Coding
4.2 Decoding
5 Syntactic Structure
5.1 Function
5.2 Bit Stream
5.3 Frame
5.4 Frame Header
5.5 Window Sequence
5.6 Codebook Selection and Application Range
5.7 Quantization Index of Subband Sample
5.8 Quantization Step Size Index
5.9 Determination of Sum-Difference Coding
5.10 Scale Factor of Joint Intensity Coding
5.11 Bit Filling
5.12 Auxiliary Data
6 Semantics
6.1 Bit Stream
6.2 Frame
6.3 Frame Header
6.4 Window Sequence
6.5 Codebook Selection and Application Range
6.6 Quantization Index of Subband Sample
6.7 Quantization Step Size Index
6.8 Determination of Sum-Difference Coding
6.9 Scale Factor of Joint Intensity Coding
6.10 Bit Filling
6.11 Auxiliary Data
7 Decoding
7.1 Channel Emission and Arrangement
7.2 Decoding of Crossover Recombination
7.3 Number of Reconstructed Quantization Unit
7.4 Inverse Quantization
7.5 Joint Intensity Decoding
7.6 Sum/Difference Decoding
7.7 Synthesis Filter Bank with Variable Resolution
7.8 Reconstruction of Short/Brief Window Function Sequence
8 Multiplexing of Audio Stream in MPEG TS
Appendix A (Informative) Relevant Coding Technology
A.1 Transient Analysis
A.2 Human Auditory Model
A.3 Global Bit Allocation
A.4 Codebook Selection of Huffman Code
A.5 Sum/Difference Coding
A.6 Joint Intensity Coding
Appendix B (Normative) Attached Table Used for Decoding
B.1 Quantization Step Size Table
B.2 Critical Band Table
B.3 Huffman Code Table Used for Decoding Transient Cluster Length
B.4 Huffman Code Table Used for Decoding Codebook Selection and Application Range
B.5 Huffman Code Table Used for Decoding Quantization Step Size Index
B.6 Huffman Code Table Used for Decoding Quantization Index Quotient Width
B.7 Huffman Code Table Used for Decoding Stationary Quantization Index
B.8 Huffman Code Table Used for Decoding Transient Quantization Index
Appendix C (Normative) Multiplexing of Audio Stream in MPEG TS
C.1 Stream_ID
C.2 Stream_Type
C.3 DRA Registration Descriptor
C.4 DRA Audio Stream Descriptor
C.5 STD Audio Buffer Size
C.6 Byte Alignment
Appendix D (Informative) Downmix
D.1 Downmix Formulas and Coefficients
D.1.1 1/0 Mode
D.1.2 2/0 Lo/Ro Mode
D.1.3 2/0 Lt/Rt Mode
D.1.4 Downmix Modes of 3/2/1 5.1 Surround Sound
References
Figure 1 Coding Block Diagram
Figure 2 Decoding Block Diagram
Figure 3 Window Function of Synthesis Filter Bank with Variable Resolution
Figure 4 Some Examples Converted by Window Function
Figure A.1 Distinction between Codebook Selection Method of Huffman Code in This Standard and Other Technology
Figure A.2 Elimination of Some Small Isolated Codebook Segments
Table 1 Coding
Table 2 Decoding
Table 3 Definition of Special Function
Table 4 Frame Structure
Table 5 Data Structure of Normal Channel
Table 6 Data Structure of Low Frequency Enhancement Channel
Table 7 Frame Header Type
Table 8 Distinction between Two Types of Frame Header
Table 9 Bit Number Used for Decoding Frame Length of Audio Data
Table 10 Sample Rate Supported in This Standard
Table 11 Bit Number Used for Decoding Number of Normal Channel
Table 12 Bit Number Used for Decoding Number of Low Frequency Enhancement Channel
Table 13 Auxiliary Data Determination of Channel Arrangement
Table 14 Determination of Sum-Difference Coding
Table 15 Determination of Joint Intensity Coding
Table 16 Index of Window Function
Table 17 Number of Transient Cluster
Table 18 Implied Length of Transient Cluster in Stationary Frame
Table 19 Starting Location of the First Transient Cluster and Generation Location of the First Transient
Table 20 Selection of Huffman Codebook Used for Decoding Application Range of Codebook
Table 21 Selection of Huffman Codebook Used for Decoding Codebook Index
Table 22 Selection of Huffman Codebook Used for Decoding Quantization Index
Table 23 Selection of Huffman Codebook Used for Decoding Bit Number Needed for Packing Quotient
Table 24 Variable Used for Decoding Quantization Index
Table 25 Selection of Huffman Codebook Used for Decoding Quantization Step Size Index
Table 26 Variable Used for Decoding Determination of Sum-Difference Coding
Table 27 Determination of Not Using Sum-Difference Coding
Table 28 Determination of Sum-Difference Coding
Table 29 Default Normal Channel Arrangement
Table 30 Representation Method of Common Channel Arrangement
Table 31 Emission Order of Audio Data of Each Channel in Audio Frame
Table 32 Emission Order of 5.1 Surround Audio Data in Audio Frame
Table 33 Subband Sample Arranged according to Natural Order
Table 34 Subband Sample Arranged according to Crossover Recombination Order
Table 35 Variable Used for Decoding Crossover Recombination
Table 36 Variable Used by Number of Reconstructed Quantization Unit
Table 37 Variable Used by Inverse Quantization
Table 38 Variable Used by Joint Intensity Decoding
Table 39 Variable Used by Sum/Difference Decoding
Table 40 Generation Location of Transient and Its Front and Back Optional Window Function
Table 41 Variable Used for Reconstructing Short MDCT Window Function Sequence
Table B.1 Quantization Step Size
Table B.2 Critical Band: 8 000 Hz, Long Window
Table B.3 Critical Band: 8 000 Hz, Short Window
Table B.4 Critical Band: 11 025 Hz, Long Window
Table B.5 Critical Band: 11 025 Hz, Short Window
Table B.6 Critical Band: 12 000 Hz, Long Window
Table B.7 Critical Band: 12 000 Hz, Short Window
Table B.8 Critical Band: 16 000 Hz, Long Window
Table B.9 Critical Band: 16 000 Hz, Short Window
Table B.10 Critical Band: 22 050 Hz, Long Window
Table B.11 Critical Band: 22 050 Hz, Short Window
Table B.12 Critical Band: 24 000 Hz, Long Window
Table B.13 Critical Band: 24 000 Hz, Short Window
Table B.14 Critical Band: 32 000 Hz, Long Window
Table B.15 Critical Band: 32 000 Hz, Short Window
Table B.16 Critical Band: 44 100 Hz, Long Window
Table B.17 Critical Band: 44 100 Hz, Short Window
Table B.18 Critical Band: 48 000 Hz, Long Window
Table B.19 Critical Band: 48 000 Hz, Short Window
Table B.20 Critical Band: 88 200 Hz, Long Window
Table B.21 Critical Band: 88 200 Hz, Short Window
Table B.22 Critical Band: 96 000 Hz, Long Window
Table B.23 Critical Band: 96 000 Hz, Short Window
Table B.24 Critical Band: 176 400 Hz, Long Window
Table B.25 Critical Band: 176 400 Hz, Short Window
Table B.26 Critical Band: 192 000 Hz, Long Window
Table B.27 Critical Band: 192 000 Hz, Short Window
Table B.28 HuffDec1_7x
Table B.29 HuffDec2_64x
Table B.30 HuffDec3_32x
Table B.31 HuffDec4_18x
Table B.32 HuffDec5_18x
Table B.33 HuffDec6_116x
Table B.34 HuffDec7_116x
Table B.35 HuffDec8_16x
Table B.36 HuffDec9_16x
Table B.37 HuffDec10_81x
Table B.38 HuffDec11_25x
Table B.39 HuffDec12_81x
Table B.40 HuffDec13_289x
Table B.41 HuffDec14_31x
Table B.42 HuffDec15_63x
Table B.43 HuffDec16_127x
Table B.44 HuffDec17_255x
Table B.45 HuffDec18_256x
Table B.46 HuffDec19_81x
Table B.47 HuffDec20_25x
Table B.48 HuffDec21_81x
Table B.49 HuffDec22_289x
Table B.50 HuffDec23_31x
Table B.51 HuffDec24_63x
Table B.52 HuffDec25_127x
Table B.53 HuffDec26_255x
Table B.54 HuffDec27_256x
Table C.1 DRA Registration Descriptor
Table C.2 DRA Audio Stream Descriptor