• DocumentCode
    2282655
  • Title

    Feature extraction using non-linear transformation for robust speech recognition on the Aurora database

  • Author

    Sharma, Shantanu ; Ellis, Dan ; Kajarekar, Sachin ; Jain, Pratibha ; Hermansky, Hynek

  • Author_Institution
    Oregon Graduate Inst. of Sci. & Technol., Portland, OR, USA
  • Volume
    2
  • fYear
    2000
  • fDate
    2000
  • Abstract
    We evaluate the performance of several feature sets on the Aurora task as defined by ETSI. We show that after a non-linear transformation, a number of features can be effectively used in an HMM-based recognition system. The non-linear transformation is computed using a neural network which is discriminatively trained on the phonetically labeled (forcibly aligned) training data. A combination of the non-linearly transformed PLP (perceptive linear predictive coefficients), MSG (modulation filtered spectrogram) and TRAP (temporal pattern) features yields a 63% improvement in error rate as compared to baseline mel-frequency cepstral coefficient features. The use of the non-linearly transformed RASTA-like features, with system parameters scaled down to take into account the ETSI-imposed memory and latency constraints, still yields a 40% improvement in error rate.
  • Keywords
    error statistics; feature extraction; hidden Markov models; neural nets; prediction theory; spectral analysis; speech recognition; transforms; Aurora database; HMM-based recognition system; MSG features; PLP features; RASTA-like features; TRAP features; error rate; feature extraction; latency constraints; memory constraints; modulation filtered spectrogram; neural network; nonlinear transformation; perceptive linear predictive coefficients; phonetically labeled training data; robust speech recognition; system parameters; temporal pattern; Chirp modulation; Computer networks; Error analysis; Feature extraction; Neural networks; Nonlinear filters; Robustness; Spectrogram; Telecommunication standards; Training data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on
  • Conference_Location
    Istanbul
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-6293-4
  • Type

    conf

  • DOI
    10.1109/ICASSP.2000.859160
  • Filename
    859160