A Psychoacoustic Model for Audio Coding Based on a Cochlear Filter Bank

Frank Baumgarte

To appear at Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA01), Mohonk Mountain Resort, NY, 21-24 October 2001


Abstract

Perceptual audio coders use an estimated masked threshold for the determination of the maximum permissible just-inaudible noise level introduced by quantization. This estimate is derived from a psychoacoustic model mimicking the psychoacoustics of masking. Current applications use a uniform spectral decomposition as first stage of that model to approximate the frequency selectivity of the human auditory system. The availability of efficient implementations led to a virtually exclusive use of uniform decompositions in audio coding. However, the equal filter properties of the uniform sub-bands are not in line with the nonuniform auditory filters. This paper presents a psychoacoustic model based on an efficient nonuniform cochlear filter bank with a simplified less complex post-processing for estimating the masked threshold. Application results in audio coding show a significantly better performance in terms of bit rate and/or quality of the new model in comparison with other state-of-the-art models with a uniform spectral decomposition.


Server START Conference Manager
Update Time 5 Jul 2001 at 16:08:04
Maintainer malcolm@ieee.org.
Start Conference Manager
Conference Systems