DocumentCode
2800877
Title
Robust spectro-temporal features based on autoregressive models of Hilbert envelopes
Author
Ganapathy, Sriram ; Thomas, Samuel ; Hermansky, Hynek
Author_Institution
Dept. of Electr. & Comput. Eng., Johns Hopkins Univ., Baltimore, MD, USA
fYear
2010
fDate
14-19 March 2010
Firstpage
4286
Lastpage
4289
Abstract
In this paper, we present a robust spectro-temporal feature extraction technique using autoregressive (AR) models of sub-band Hilbert envelopes. AR models of Hilbert envelopes are derived using frequency domain linear prediction (FDLP). From the sub-band Hilbert envelopes, spectral features are derived by integrating these envelopes in short-term frames, and temporal features are formed by converting these envelopes into modulation frequency components. The spectral and temporal feature streams are then combined at the phoneme posterior level and used as the input features for a recognition system. For the proposed features, robustness is achieved by using novel techniques of noise compensation and gain normalization. Phoneme recognition experiments on telephone speech in the HTIMIT database show significant performance improvements for the proposed features when compared to other robust feature techniques (average relative reduction of 10.6% in phoneme error rate). In addition to the overall phoneme recognition rates, the performance on broad phonetic classes is also reported.
Keywords
Hilbert transforms; autoregressive processes; feature extraction; speech processing; Hilbert envelopes; autoregressive models; frequency domain linear prediction; phoneme recognition; spectro-temporal feature extraction; telephone speech; Error analysis; Feature extraction; Frequency conversion; Frequency domain analysis; Frequency modulation; Noise robustness; Predictive models; Spatial databases; Speech recognition; Telephony; Frequency domain linear prediction (FDLP); Hilbert Envelopes; Phoneme recognition; Robust spectro-temporal features;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location
Dallas, TX
ISSN
1520-6149
Print_ISBN
978-1-4244-4295-9
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2010.5495668
Filename
5495668