Access Units


An “Access Unit” is a general concept in audio coding theory used across various codecs.

For optimal audio quality, the input to an audio encoder should be linear digital time samples: original, high-quality PCM, such as a WAV file or another uncompressed or lossless format. This is also why cascaded coding (codec after codec) does not deliver acceptable quality: each lossy encoder in the chain would start from audio that has already been degraded.

The audio codec transforms a fixed number of these time samples (a frame) into an equally sized group of frequency samples by using a mathematical transform (a Fourier-related transform such as the MDCT).
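As a rough illustration of this step, the sketch below transforms one frame of time samples into the same number of frequency bins. It uses a naive discrete Fourier transform for readability; real codecs typically use a more efficient, overlapped transform. The names `FRAME_SIZE` and `dft` and the 8-sample frame are assumptions made for this example only.

```python
import cmath
import math

FRAME_SIZE = 8  # hypothetical frame length; real codecs use far longer frames


def dft(frame):
    """Naive discrete Fourier transform: N time samples in, N frequency bins out."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


# One frame holding exactly one cycle of a cosine wave.
frame = [math.cos(2 * math.pi * t / FRAME_SIZE) for t in range(FRAME_SIZE)]
bins = dft(frame)

# The energy lands in bin 1 and its mirror image, bin 7; all other bins are ~0.
magnitudes = [round(abs(b), 6) for b in bins]
print(magnitudes)
```

The point of the example is only that a pure tone, which occupies every time sample, collapses into very few frequency bins. That concentration is what makes the later steps effective.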

The advantage of frequency frames is that they can be compared against the sensitivity curve of human hearing. The ear is not equally sensitive to all frequencies, so we can delete irrelevant information that the human ear will not hear. We call this process perceptual coding: removing frequency information that will not be heard because of the way human hearing works.
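A heavily simplified sketch of this idea: each frequency bin is compared with an audibility threshold, and bins below it are dropped. The function name `drop_inaudible` and all numbers are invented for illustration; a real psychoacoustic model is far more elaborate and also accounts for masking between neighbouring sounds.

```python
def drop_inaudible(magnitudes, thresholds):
    """Zero every frequency bin whose magnitude is below its audibility threshold."""
    return [m if m >= t else 0.0 for m, t in zip(magnitudes, thresholds)]


# Made-up bin magnitudes and made-up per-bin hearing thresholds.
magnitudes = [0.2, 4.0, 0.6, 0.1, 1.5, 0.3]
thresholds = [1.0, 0.5, 0.5, 0.5, 1.0, 2.0]

kept = drop_inaudible(magnitudes, thresholds)
print(kept)  # [0.0, 4.0, 0.6, 0.0, 1.5, 0.0]
```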

Afterwards, we encode the remaining frequency samples with fewer bits (for instance, by quantizing them more coarsely or by reducing the sample rate of lower frequency bands).
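One simple way to spend fewer bits on a frequency sample is uniform quantization: store a small integer index instead of the full-precision value. This is a swapped-in illustration, not necessarily the exact method a given codec uses, and the names `quantize`, `dequantize`, and the step size are assumptions.

```python
def quantize(values, step):
    """Map each value to the index of the nearest multiple of `step`."""
    return [round(v / step) for v in values]


def dequantize(indices, step):
    """Rebuild approximate values from the quantizer indices."""
    return [i * step for i in indices]


coeffs = [4.0, 0.6, -1.5, 0.1]  # made-up frequency samples
step = 0.5                      # hypothetical step size; larger means fewer bits

indices = quantize(coeffs, step)
restored = dequantize(indices, step)
print(indices)   # [8, 1, -3, 0]
print(restored)  # [4.0, 0.5, -1.5, 0.0]
```

The small integers need fewer bits than the original values, at the price of an error of at most half a step per sample. This is the lossy part of the chain.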

Finally, lossless entropy coding (Huffman coding) reduces the number of bits in the resulting frames a second time.
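The sketch below shows the principle of Huffman coding on a short list of made-up quantized values: frequent values get short codes, rare values get long ones. Real codecs generally use predefined code tables rather than building a tree per frame, so treat `huffman_codes` and the 8-bit fixed-length baseline as assumptions for illustration.

```python
import heapq
from collections import Counter


def huffman_codes(symbols):
    """Build a prefix-free code in which frequent symbols get shorter bit strings."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


# Made-up quantized values; zeros dominate, as is typical after quantization.
quantized = [0, 0, 0, 0, 0, 1, 1, -1, 0, 8, 0, 0]
codes = huffman_codes(quantized)
encoded = "".join(codes[s] for s in quantized)

fixed_bits = len(quantized) * 8  # hypothetical baseline: 8 bits per value
print(len(encoded), "bits instead of", fixed_bits)
```

No information is lost in this step: the decoder can recover the exact quantized values from the bit string because no code is a prefix of another.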

The result is a new output frame with far fewer bits, often as little as about one tenth of the original audio input. These frames are transported as compressed audio at a bit rate much lower than that of the original linear time samples at the input.

These frames are Access Units, or AUs.
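To give a feeling for the numbers, the following back-of-the-envelope calculation compares the bit rate of CD-style PCM with a coded stream and estimates the average size of one frame. The 128 kbit/s coded rate and the 1024 samples per frame are assumptions chosen for this example; actual values depend on the codec and its configuration.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Bit rate of uncompressed linear PCM in bit/s."""
    return sample_rate_hz * bits_per_sample * channels


def average_frame_bytes(coded_bit_rate, samples_per_frame, sample_rate_hz):
    """Average size in bytes of one coded frame at a given coded bit rate."""
    frame_duration_s = samples_per_frame / sample_rate_hz
    return coded_bit_rate * frame_duration_s / 8


pcm = pcm_bit_rate(44_100, 16, 2)  # 44.1 kHz, 16 bit, stereo
coded = 128_000                    # assumed coded bit rate in bit/s
frame = average_frame_bytes(coded, 1024, 44_100)  # assumed 1024 samples per frame

print(pcm)                    # 1411200
print(round(pcm / coded, 1))  # 11.0
print(round(frame, 1))        # 371.5
```

Under these assumptions the coded stream needs roughly one eleventh of the PCM bit rate, and each frame covers about 23 ms of audio in a few hundred bytes.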

For example, coded HE-AAC and MP3 audio frames are AUs. The decoder can decode each AU back into linear time samples (audio).

The decoder's linear output contains less information than the original input to the encoder. However, if the encoding/decoding cycle occurs only once in the chain, the difference should be minimal and the output can be considered close to input quality.
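A toy model can show why a second lossy pass hurts. Here each "codec" is reduced to a single quantizer with its own step size, which is a strong simplification; `lossy_pass` and the sample values are hypothetical, and real codecs interact in more complex ways.

```python
def lossy_pass(samples, step):
    """Stand-in for one encode/decode cycle: snap each sample to a multiple of `step`."""
    return [round(s / step) * step for s in samples]


def max_error(original, decoded):
    """Largest absolute deviation from the original samples."""
    return max(abs(o - d) for o, d in zip(original, decoded))


original = [0.7, 1.2, -0.4, 2.05]  # made-up input samples

once = lossy_pass(original, 0.5)   # first codec
twice = lossy_pass(once, 0.3)      # second codec, fed with already degraded audio

error_once = max_error(original, once)
error_twice = max_error(original, twice)
print(error_once, error_twice)
```

In this example the worst-case error after the second pass is larger than after the first, because the second quantizer adds its own rounding error on top of audio that no longer matches the original.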

Lesson learned: avoid cascaded encoding and decoding!
