Title :
A Noise-Robust FFT-Based Auditory Spectrum With Application in Audio Classification
Author :
Chu, Wei ; Champagne, Benoît
Author_Institution :
Dept. of Electr. & Comput. Eng., McGill Univ., Montreal, QC
Abstract :
In this paper, we investigate the noise robustness of Wang and Shamma's early auditory (EA) model for the calculation of an auditory spectrum in audio classification applications. First, a stochastic analysis is conducted wherein an approximate expression of the auditory spectrum is derived to justify the noise-suppression property of the EA model. Second, we present an efficient fast Fourier transform (FFT)-based implementation for the calculation of a noise-robust auditory spectrum, which allows flexibility in the extraction of audio features. To evaluate the performance of the proposed FFT-based auditory spectrum, a set of speech/music/noise classification tasks is carried out wherein a support vector machine (SVM) algorithm and a decision tree learning algorithm (C4.5) are used as the classifiers. Features used for classification include conventional Mel-frequency cepstral coefficients (MFCCs), MFCC-like features obtained from the original auditory spectrum (i.e., based on the EA model) and the proposed FFT-based auditory spectrum, as well as spectral features (spectral centroid, bandwidth, etc.) computed from the latter. Compared to the conventional MFCC features, both the MFCC-like and spectral features derived from the proposed FFT-based auditory spectrum show more robust performance in noisy test cases. Test results also indicate that, using the new MFCC-like features, the performance of the proposed FFT-based auditory spectrum is slightly better than that of the original auditory spectrum, while its computational complexity is reduced by an order of magnitude.
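The spectral features named in the abstract (spectral centroid and bandwidth) can be illustrated with a minimal sketch. It assumes a plain DFT power spectrum as input, not the paper's noise-robust auditory spectrum, and it uses one common definition of the two features (power-weighted mean frequency and the standard deviation about it), which may differ from the authors' exact formulas. The function names, the 8 kHz sample rate, and the 64-sample test frame are all hypothetical choices made for illustration.

```python
import cmath
import math


def power_spectrum(frame):
    """Power spectrum of a real-valued frame via a naive DFT (bins 0..N/2).

    A naive O(N^2) DFT keeps the sketch dependency-free; an FFT routine
    would return the same values faster.
    """
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_centroid_bandwidth(power, sample_rate, frame_len):
    """Return (centroid, bandwidth) in Hz for one power spectrum.

    Assumed definitions: the centroid is the power-weighted mean frequency,
    and the bandwidth is the power-weighted standard deviation about it.
    """
    freqs = [k * sample_rate / frame_len for k in range(len(power))]
    total = sum(power)
    if total == 0.0:
        # Silent frame: no meaningful centroid, so report zeros.
        return 0.0, 0.0
    centroid = sum(f * p for f, p in zip(freqs, power)) / total
    variance = sum((f - centroid) ** 2 * p for f, p in zip(freqs, power)) / total
    return centroid, math.sqrt(variance)


# Hypothetical test frame: a 1 kHz tone sampled at 8 kHz, 64 samples long,
# so the tone falls exactly on DFT bin 8.
SAMPLE_RATE = 8000
FRAME_LEN = 64
frame = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(FRAME_LEN)]
centroid, bandwidth = spectral_centroid_bandwidth(
    power_spectrum(frame), SAMPLE_RATE, FRAME_LEN
)
```

For a pure tone that falls on a DFT bin, the centroid sits at the tone frequency and the bandwidth is close to zero; broadband noise would raise the bandwidth, which is one reason such features can help separate speech, music, and noise.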
Keywords :
audio signal processing; cepstral analysis; computational complexity; decision trees; fast Fourier transforms; feature extraction; interference suppression; learning (artificial intelligence); signal classification; support vector machines; Mel-frequency cepstral coefficient; audio classification; audio feature extraction; decision tree learning algorithm; noise-robust fast Fourier transform-based auditory spectrum; noise-suppression property; stochastic analysis; support vector machine algorithm; C4.5; early auditory (EA) model; noise suppression; self-normalization; support vector machine (SVM)
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
DOI :
10.1109/TASL.2007.907569