• DocumentCode
    2800877
  • Title
    Robust spectro-temporal features based on autoregressive models of Hilbert envelopes
  • Author
    Ganapathy, Sriram ; Thomas, Samuel ; Hermansky, Hynek
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Johns Hopkins Univ., Baltimore, MD, USA
  • fYear
    2010
  • fDate
    14-19 March 2010
  • Firstpage
    4286
  • Lastpage
    4289
  • Abstract
    In this paper, we present a robust spectro-temporal feature extraction technique using autoregressive (AR) models of sub-band Hilbert envelopes. AR models of Hilbert envelopes are derived using frequency domain linear prediction (FDLP). From the sub-band Hilbert envelopes, spectral features are derived by integrating these envelopes in short-term frames, and temporal features are formed by converting these envelopes into modulation frequency components. The spectral and temporal feature streams are then combined at the phoneme posterior level and used as the input features for a recognition system. For the proposed features, robustness is achieved by using novel techniques of noise compensation and gain normalization. Phoneme recognition experiments on telephone speech in the HTIMIT database show significant performance improvements for the proposed features when compared to other robust feature techniques (average relative reduction of 10.6% in phoneme error rate). In addition to the overall phoneme recognition rates, the performance with broad phonetic classes is also reported.
  • Keywords
    Hilbert transforms; autoregressive processes; feature extraction; speech processing; Hilbert envelopes; autoregressive models; frequency domain linear prediction; phoneme recognition; spectro-temporal feature extraction; telephone speech; Error analysis; Feature extraction; Frequency conversion; Frequency domain analysis; Frequency modulation; Noise robustness; Predictive models; Spatial databases; Speech recognition; Telephony; Frequency domain linear prediction (FDLP); Hilbert Envelopes; Phoneme recognition; Robust spectro-temporal features;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
  • Conference_Location
    Dallas, TX
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-4295-9
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2010.5495668
  • Filename
    5495668