G.711
G.711 is an ITU-T standard for audio companding, titled Pulse code modulation (PCM) of voice frequencies. It is a required standard in many technologies, such as the H.320 and H.323 standards. It was originally designed for use in telephony and was released for use in 1972. It can also be used for fax communication over IP networks. G.711 is a narrowband audio codec that provides toll-quality audio at 64 kbit/s. G.711 passes audio signals in the range of 300–3400 Hz and samples them at a rate of 8,000 samples per second, with a tolerance on that rate of 50 parts per million. Non-uniform quantization with 8 bits is used to represent each sample, resulting in a 64 kbit/s bit rate. There are two slightly different versions: μ-law, which is used primarily in North America and Japan, and A-law, which is in use in most other countries outside North America.
Enhancements
Two enhancements to G.711 have been published: G.711.0 utilizes lossless data compression to reduce bandwidth usage, and G.711.1 increases audio quality by increasing bandwidth.
Features
- 8 kHz sampling frequency
- 64 kbit/s bitrate
- Typical algorithmic delay is 0.125 ms, with no look-ahead delay
- G.711 is a waveform speech coder
- G.711 Appendix I defines a packet loss concealment algorithm to help hide transmission losses in a packetized network
- G.711 Appendix II defines a discontinuous transmission algorithm which uses voice activity detection and comfort noise generation to reduce bandwidth usage during silence periods
- PSQM testing under ideal conditions yields mean opinion scores of 4.45 for G.711 μ-law, 4.45 for G.711 A-law
- PSQM testing under network stress yields mean opinion scores of 4.13 for G.711 μ-law, 4.11 for G.711 A-law
Types
The μ-law and A-law algorithms encode 14-bit and 13-bit signed linear PCM samples, respectively, to logarithmic 8-bit samples. Thus, the G.711 encoder will create a 64 kbit/s bitstream for a signal sampled at 8 kHz.
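The 64 kbit/s figure follows directly from the sampling rate and the sample width. The short C sketch below works through that arithmetic and also derives the number of octets in a block of audio. The function names are hypothetical, and the 20 ms block length used in the note afterwards is only an illustrative assumption, not something G.711 prescribes.

```c
#include <assert.h>

enum {
    G711_SAMPLE_RATE_HZ  = 8000, /* samples per second */
    G711_BITS_PER_SAMPLE = 8     /* one octet per companded sample */
};

/* Bit rate in bit/s: 8000 samples/s times 8 bits per sample. */
int g711_bit_rate(void)
{
    return G711_SAMPLE_RATE_HZ * G711_BITS_PER_SAMPLE;
}

/* Octets of G.711 audio contained in block_ms milliseconds
 * (hypothetical helper, whole milliseconds only). */
int g711_block_octets(int block_ms)
{
    assert(block_ms > 0);
    return (G711_SAMPLE_RATE_HZ / 1000) * block_ms * G711_BITS_PER_SAMPLE / 8;
}
```

Under these assumptions, g711_bit_rate() gives 64000 and a 20 ms block holds 160 octets.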
G.711 μ-law tends to give more resolution to higher range signals while G.711 A-law provides more quantization levels at lower signal levels.
G.711 μ-law is also referred to as PCMU, G711u or G711MU, and G.711 A-law as PCMA or G711A.
A-law
A-law encoding takes a 13-bit signed linear audio sample as input and converts it to an 8-bit value as follows:
Linear input code | Compressed code XOR 01010101 | Linear output code |
s0000000abcdx | s̄000abcd | s0000000abcd1 |
s0000001abcdx | s̄001abcd | s0000001abcd1 |
s000001abcdxx | s̄010abcd | s000001abcd10 |
s00001abcdxxx | s̄011abcd | s00001abcd100 |
s0001abcdxxxx | s̄100abcd | s0001abcd1000 |
s001abcdxxxxx | s̄101abcd | s001abcd10000 |
s01abcdxxxxxx | s̄110abcd | s01abcd100000 |
s1abcdxxxxxxx | s̄111abcd | s1abcd1000000 |
where s is the sign bit, s̄ is its inverse (so positive values are encoded with the most significant bit set to 1), and bits marked x are discarded. Note that the first column of the table uses a different representation of negative values than the third column. For example, the input decimal value −21 is represented in binary after bit inversion as 1000000010100, which maps to 00001010 (first row of the table). When decoding, this maps back to 1000000010101, which is interpreted as the output value −21 in decimal. The input value +52 (0000000110100 in binary) maps to 10011010 (second row), which maps back to 0000000110101 (+53 in decimal).
This can be seen as a floating-point number with 4 bits of mantissa m, 3 bits of exponent e and 1 sign bit s, formatted as
s̄eeemmmm
with the decoded linear value y given by the formula
$$y = (-1)^{s} \cdot \left(16 \cdot \min\{e, 1\} + m + 0.5\right) \cdot 2^{\max\{e, 1\}}$$
which is a 13-bit signed integer in the range ±1 to ±(2^12 − 2^6) = ±4032. Note that no compressed code decodes to zero due to the addition of 0.5.
In addition, the standard specifies that all resulting even bits are inverted before the octet is transmitted. This is to provide plenty of 0/1 transitions to facilitate the clock recovery process in the PCM receivers. Thus, a silent A-law encoded PCM channel has the 8 bit samples coded 0xD5 instead of 0x80 in the octets.
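The encoding table and the even-bit inversion can be sketched in C as follows. This is a minimal illustration rather than the ITU-T reference code: the function name alaw_encode is hypothetical, and the input is assumed to be a 13-bit two's complement value held in an int (−4096 to 4095).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: compress one 13-bit linear sample to the A-law
 * octet as transmitted (even bits already inverted). */
uint8_t alaw_encode(int pcm)
{
    assert(pcm >= -4096 && pcm <= 4095);       /* 13-bit input assumed */

    /* The compressed code carries the inverted sign: 1 for positive. */
    unsigned sign_inv = (pcm >= 0) ? 0x80u : 0x00u;

    /* For negative input, inverting the bits after the sign bit of a
     * two's complement value yields -pcm - 1. */
    unsigned mag = (pcm >= 0) ? (unsigned)pcm : (unsigned)(-pcm - 1);

    /* Exponent: 0 for magnitudes below 32, otherwise the position of
     * the leading one counted from bit 4. */
    unsigned e = 0;
    for (unsigned t = mag >> 5; t != 0; t >>= 1)
        e++;

    /* Rows 0 and 1 both discard a single low bit (x); row e discards e. */
    unsigned shift = (e == 0) ? 1u : e;
    unsigned m = (mag >> shift) & 0x0Fu;       /* the abcd bits */

    return (uint8_t)((sign_inv | (e << 4) | m) ^ 0x55u);
}
```

With this sketch a silent sample (0) encodes to 0xD5, and the example value −21 gives 00001010 before the inversion, which is 0x5F once the even bits are inverted.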
When data is sent over E0, MSB is sent first and LSB is sent last.
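ITU-T STL implements the decoding algorithm in the function alaw_expand, which takes the number of samples (lseg), the buffer of compressed samples (logbuf) and the buffer that receives the linear samples (linbuf). Its K&R-style header reads:
void alaw_expand(lseg, logbuf, linbuf)
  long lseg;
  short *linbuf;
  short *logbuf;
See also the "ITU-T Software Tool Library 2009 User's manual".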
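Independently of the STL routine, the decoding column of the A-law table can be sketched directly in C. The function name alaw_decode is hypothetical. The sketch returns the 13-bit linear output value from the table as a plain signed int, which is an assumption about the desired output format and not necessarily the scaling used by other implementations.

```c
#include <stdint.h>

/* Hypothetical helper: expand one transmitted A-law octet to the linear
 * output value of the table (magnitude 1..4032 with a sign). */
int alaw_decode(uint8_t octet)
{
    unsigned code = octet ^ 0x55u;         /* undo the even-bit inversion */
    int positive = (code & 0x80u) != 0;    /* MSB is the inverted sign bit */
    unsigned e = (code >> 4) & 0x07u;      /* 3-bit exponent */
    unsigned m = code & 0x0Fu;             /* 4-bit mantissa */

    /* Row 0 decodes to abcd1; rows 1..7 decode to 1abcd1 followed by
     * e - 1 zero bits. */
    int mag = (e == 0) ? (int)(2u * m + 1u)
                       : (int)((2u * m + 33u) << (e - 1u));

    return positive ? mag : -mag;
}
```

Under these assumptions, the octet 0x5F decodes to −21 and 0xCF decodes to +53, in line with the worked example in the text, and no octet decodes to 0.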
μ-law
The μ-law encoding takes a 14-bit signed linear audio sample in two's complement representation as input, inverts all bits after the sign bit if the value is negative, adds 33 and converts the result to an 8-bit value as follows:
Linear input value | Compressed code XOR 11111111 | Linear output value |
s00000001abcdx | s000abcd | s00000001abcd1 |
s0000001abcdxx | s001abcd | s0000001abcd10 |
s000001abcdxxx | s010abcd | s000001abcd100 |
s00001abcdxxxx | s011abcd | s00001abcd1000 |
s0001abcdxxxxx | s100abcd | s0001abcd10000 |
s001abcdxxxxxx | s101abcd | s001abcd100000 |
s01abcdxxxxxxx | s110abcd | s01abcd1000000 |
s1abcdxxxxxxxx | s111abcd | s1abcd10000000 |
where s is the sign bit, and bits marked x are discarded.
In addition, the standard specifies that all result bits are inverted before the octet is transmitted. Thus, a silent μ-law encoded PCM channel has the 8 bit samples coded 0xFF instead of 0x00 in the octets.
Adding 33 is necessary so that all values fall into a compression group and it is subtracted back when decoding.
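The μ-law steps (conditional inversion, adding 33, table lookup and final inversion of all bits) can be sketched in C as follows. The function name ulaw_encode is hypothetical, the input is assumed to be a 14-bit two's complement value held in an int, and magnitudes are clipped to 8158 so that the sum with 33 still fits in 13 bits. The clipping step is an assumption of this sketch and is not described in the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: compress one 14-bit linear sample to the mu-law
 * octet as transmitted (all bits already inverted). */
uint8_t ulaw_encode(int pcm)
{
    assert(pcm >= -8192 && pcm <= 8191);   /* 14-bit input assumed */

    unsigned s = (pcm < 0) ? 0x80u : 0x00u;

    /* For negative input, inverting the bits after the sign bit of a
     * two's complement value yields -pcm - 1. */
    unsigned mag = (pcm < 0) ? (unsigned)(-pcm - 1) : (unsigned)pcm;

    if (mag > 8158u)                       /* assumed clipping level */
        mag = 8158u;
    unsigned biased = mag + 33u;           /* now in 33..8191 */

    /* Exponent: position of the leading one counted from bit 5. */
    unsigned e = 0;
    for (unsigned t = biased >> 6; t != 0; t >>= 1)
        e++;

    unsigned m = (biased >> (e + 1u)) & 0x0Fu;   /* the abcd bits */

    return (uint8_t)((s | (e << 4) | m) ^ 0xFFu);
}
```

With this sketch the input 0 encodes to 0xFF, the silent-channel value quoted above.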
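Breaking the encoded value (the transmitted octet XOR 11111111) formatted as
seeemmmm
into 4 bits of mantissa m, 3 bits of exponent e and 1 sign bit s, the decoded linear value y is given by the formula
$$y = (-1)^{s} \cdot \left[(33 + 2m) \cdot 2^{e} - 33\right]$$
which is a 14-bit signed integer in the range ±0 to ±8031.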
Note that 0 is encoded as 0xFF, and −1 is encoded as 0x7F, but when decoded back the result is 0 in both cases.
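The μ-law decoding rule, including the subtraction of 33, can be sketched in C as follows. The function name ulaw_decode is hypothetical, and the result is returned as a plain signed int.

```c
#include <stdint.h>

/* Hypothetical helper: expand one transmitted mu-law octet to the
 * linear output value (magnitude 0..8031 with a sign). */
int ulaw_decode(uint8_t octet)
{
    unsigned code = octet ^ 0xFFu;         /* undo the inversion of all bits */
    int negative = (code & 0x80u) != 0;    /* sign bit s */
    unsigned e = (code >> 4) & 0x07u;      /* 3-bit exponent */
    unsigned m = code & 0x0Fu;             /* 4-bit mantissa */

    /* 1abcd1 followed by e zero bits equals (33 + 2m) * 2^e; the bias
     * of 33 added by the encoder is then removed. */
    int mag = (int)((33u + 2u * m) << e) - 33;

    return negative ? -mag : mag;
}
```

Under these assumptions, both 0xFF and 0x7F decode to 0, and the largest magnitudes are ±8031 (octets 0x80 and 0x00).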
G.711.0
G.711.0, also known as G.711 LLC, utilizes lossless data compression to reduce the bandwidth usage by as much as 50 percent. The standard, titled Lossless compression of G.711 pulse code modulation, was approved by ITU-T in September 2009.
G.711.1
G.711.1 is an extension to G.711, published as ITU-T Recommendation G.711.1 in March 2008. Its formal name is Wideband embedded extension for G.711 pulse code modulation.
G.711.1 allows the addition of narrowband and/or wideband enhancements, each at 25% of the bitrate of the base G.711 bitstream, leading to data rates of 64, 80 or 96 kbit/s.
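The three data rates follow from adding zero, one or two enhancements of 25% of the 64 kbit/s base rate. A small C sketch of that arithmetic is given below. The function and constant names are hypothetical and are not taken from the Recommendation.

```c
#include <assert.h>

enum { G711_BASE_KBPS = 64 };   /* rate of the base G.711 bitstream */

/* Hypothetical helper: total rate in kbit/s when 0, 1 or 2 enhancements
 * are added, each costing 25% of the base rate (16 kbit/s). */
int g7111_rate_kbps(int enhancements)
{
    assert(enhancements >= 0 && enhancements <= 2);
    return G711_BASE_KBPS + enhancements * (G711_BASE_KBPS / 4);
}
```

Under these assumptions, the function returns 64, 80 and 96 for 0, 1 and 2 enhancements.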
G.711.1 is compatible with G.711 at 64 kbit/s, hence an efficient deployment in existing G.711-based voice over IP infrastructures is foreseen. The G.711.1 coder can encode signals at 16 kHz with a bandwidth of 50–7000 Hz at 80 and 96 kbit/s, and for 8-kHz sampling the output may produce signals with a bandwidth ranging from 50 up to 4000 Hz, operating at 64 and 80 kbit/s.
The G.711.1 encoder creates an embedded bitstream structured in three layers corresponding to three available bit rates: 64, 80 and 96 kbit/s. The bitstream does not contain any information on which layers it contains, so an implementation requires out-of-band signalling to indicate which layers are available. The three G.711.1 layers are: log companded pulse code modulation of the lower band including noise feedback, embedded PCM extension with adaptive bit allocation for enhancing the quality of the base layer in the lower band, and weighted vector quantization coding of the higher band based on modified discrete cosine transformation.
Two extensions for G.711.1 are planned in 2010: superwideband extension and lossless bitstream compression.