• DocumentCode
    1843640
  • Title
    Modeling auditory perception to improve robust speech recognition
  • Author
    Strope, Brian ; Alwan, Abeer
  • Author_Institution
    Dept. of Electr. Eng., California Univ., Los Angeles, CA, USA
  • Volume
    2
  • fYear
    1997
  • fDate
    2-5 Nov. 1997
  • Firstpage
    1056
  • Abstract
    While non-stationary stochastic techniques have led to substantial improvements in vocabulary size and speaker independence, most automatic speech recognition (ASR) systems remain overly sensitive to the acoustic environment, precluding robust widespread applications. Our approach to this problem has been to model fundamental aspects of auditory perception, which are typically neglected in common ASR front ends, to derive a more robust and phonetically relevant parameterization of speech. Short-term adaptation and recovery, a sensitivity to local spectral peaks, and an explicit parameterization of the position and motion of local spectral peaks together reduce the error rate of a word recognition task by as much as a factor of 4. Current work also investigates the perceptual significance of pitch-rate amplitude-modulation cues in noise.
  • Keywords
    amplitude modulation; hearing; spectral analysis; speech recognition; ASR front ends; acoustic environment; auditory perception modelling; automatic speech recognition; error rate reduction; local spectral peaks sensitivity; noise; phonetically relevant parameterization; pitch-rate amplitude-modulation cues; robust speech recognition; short-term adaptation; short-term recovery; word recognition task; Automatic speech recognition; Discrete cosine transforms; Filters; Frequency estimation; Hidden Markov models; Robustness; Signal processing; Spectrogram; Speech recognition; Vocabulary;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signals, Systems & Computers, 1997. Conference Record of the Thirty-First Asilomar Conference on
  • Conference_Location
    Pacific Grove, CA, USA
  • ISSN
    1058-6393
  • Print_ISBN
    0-8186-8316-3
  • Type
    conf
  • DOI
    10.1109/ACSSC.1997.679067
  • Filename
    679067