Title :
Feature extraction for robust speech recognition based on maximizing the sharpness of the power distribution and on power flooring
Author :
Kim, Chanwoo ; Stern, Richard M.
Author_Institution :
Dept. of Electr. & Comput. Eng., Carnegie Mellon Univ., Pittsburgh, PA, USA
Abstract :
This paper presents a new robust feature extraction algorithm based on a modified approach to power bias subtraction combined with applying a threshold to the power spectral density. The power bias level is selected as the level above which the signal power distribution is sharpest. Sharpness is measured as the ratio of the arithmetic mean to the geometric mean of the medium-duration power. When this bias level is subtracted, power flooring is applied to enhance robustness. These new ideas are used to enhance our recently introduced feature extraction algorithm, PNCC (Power-Normalized Cepstral Coefficients). Experimental results show that the new PNCC, while simpler than our previous implementation, provides better performance.
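The sharpness measure and floored bias subtraction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `floor_ratio` parameter, the specific flooring rule (a fixed fraction of the original power), and the grid search over candidate bias levels are all assumptions made for illustration only.

```python
import numpy as np


def am_gm_ratio(power):
    """Ratio of arithmetic mean to geometric mean of strictly positive values.

    The ratio is at least 1 and equals 1 only when all values are identical,
    so larger values indicate a more uneven ("sharper") distribution.
    """
    power = np.asarray(power, dtype=float)
    arithmetic_mean = power.mean()
    geometric_mean = np.exp(np.log(power).mean())
    return arithmetic_mean / geometric_mean


def subtract_bias_with_floor(power, bias, floor_ratio=0.01):
    """Subtract a bias level, flooring the result at floor_ratio * power.

    Assumption: the floor is a fixed fraction of the original power; the
    abstract does not specify the flooring rule.
    """
    power = np.asarray(power, dtype=float)
    return np.maximum(power - bias, floor_ratio * power)


def select_bias(power, candidates, floor_ratio=0.01):
    """Pick the candidate bias that maximizes the AM/GM ratio.

    Assumption: sharpness is evaluated on the bias-subtracted, floored power
    of the frames that lie above the candidate bias level.
    """
    power = np.asarray(power, dtype=float)
    best_bias, best_sharpness = None, -np.inf
    for bias in candidates:
        above = power[power > bias]
        if above.size == 0:
            continue  # no frames exceed this level, nothing to measure
        floored = subtract_bias_with_floor(above, bias, floor_ratio)
        sharpness = am_gm_ratio(floored)
        if sharpness > best_sharpness:
            best_bias, best_sharpness = bias, sharpness
    return best_bias


# Hypothetical medium-duration power values for one frequency channel.
channel_power = np.array([1.0, 2.0, 8.0, 0.5, 4.0])
chosen_bias = select_bias(channel_power, candidates=[0.0, 0.5, 1.0])
normalized = subtract_bias_with_floor(channel_power, chosen_bias)
```

The floor keeps every output value strictly positive, which matters for any later logarithmic or power-law compression stage.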
Keywords :
cepstral analysis; feature extraction; speech recognition; PNCC; arithmetic mean; geometric mean; medium-duration power; power bias subtraction; power distribution sharpness; power flooring; power normalized cepstral coefficient; power spectral density; signal power distribution; Arithmetic; Hidden Markov models; Natural languages; Power distribution; Power measurement; Power system modeling; Robustness; Working environment noise; Robust speech recognition; auditory threshold; physiological modeling; sharpness of power distribution
Conference_Title :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
Print_ISBN :
978-1-4244-4295-9
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2010.5495570