• DocumentCode
    3385673
  • Title
    Auditory speech processing for scale-shift covariance and its evaluation in automatic speech recognition
  • Author
    Patterson, Roy D. ; Walters, Thomas C. ; Monaghan, Jessica ; Feldbauer, Christian ; Irino, Toshio
  • Author_Institution
    Dept. of Physiol., Dev. & Neurosci., Univ. of Cambridge, Cambridge, UK
  • fYear
    2010
  • fDate
    May 30 2010-June 2 2010
  • Firstpage
    3813
  • Lastpage
    3816
  • Abstract
    The syllables of speech contain information about the vocal tract length (VTL) of the speaker as well as the phonetic message. Ideally, the pre-processor used for automatic speech recognition (ASR) should segregate the phonetic message from the VTL information. This paper describes a method for calculating VTL-invariant auditory feature vectors from speech by segregating the message from the VTL. Spectra produced by an auditory filterbank are summarized by a Gaussian mixture model (GMM) to produce a low-dimensional feature vector. The robustness of these features is compared with that of conventional mel-frequency cepstral coefficients (MFCCs) using a hidden-Markov-model (HMM) recognizer. A dynamic, compressive gammachirp (dcGC) auditory filterbank is also introduced. The dcGC provides a level-dependent spectral analysis with near-instantaneous compression and two-tone suppression.
  • Keywords
    cepstral analysis; covariance analysis; hidden Markov models; speech recognition; Gaussian mixture model; VTL-invariant auditory feature vectors; auditory speech processing; automatic speech recognition; compressive gammachirp auditory filterbank; hidden-Markov-model recognizer; low-dimensional feature vector; mel-frequency cepstral coefficient; phonetic message; scale shift covariance; spectral analysis; two-tone suppression; vocal tract length; Automatic speech recognition; Cepstral analysis; Filter bank; Frequency; Ground penetrating radar; Hidden Markov models; Resonance; Shape; Speech analysis; Speech processing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Circuits and Systems (ISCAS), Proceedings of 2010 IEEE International Symposium on
  • Conference_Location
    Paris
  • Print_ISBN
    978-1-4244-5308-5
  • Electronic_ISBN
    978-1-4244-5309-2
  • Type
    conf
  • DOI
    10.1109/ISCAS.2010.5537725
  • Filename
    5537725