  • DocumentCode
    2875000
  • Title
    Vocal tract length invariant features for automatic speech recognition
  • Author
    Mertins, Alfred; Rademacher, Jan
  • Author_Institution
    Inst. of Phys., Oldenburg Univ.
  • fYear
    2005
  • fDate
    27 Nov. 2005
  • Firstpage
    308
  • Lastpage
    312
  • Abstract
    The effects of vocal tract length (VTL) variation are often approximated by linear frequency warping of short-time spectra. Based on this relationship, we present a method for generating vocal tract length invariant features. These new features are computed as translation-invariant, correlation-type features in a log-frequency domain. In phoneme classification experiments, their discrimination capabilities turned out to be considerably better than those of Mel-frequency cepstral coefficients (MFCCs). The best results are obtained when VTL-invariant (VTLI) features and MFCCs are combined. The superiority of the combined feature set and its resilience to VTL variations are also shown for word recognition, using the TIDIGITS corpus and the HTK recognizer.
  • Keywords
    correlation theory; feature extraction; speech recognition; automatic speech recognition; correlation-type features; linear frequency warping; log-frequency domain; phoneme classification; vocal tract length invariant features; Automatic speech recognition; Cepstral analysis; Frequency; Hidden Markov models; Physics; Robustness; Signal processing; Signal resolution; Testing; Wavelet transforms
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding, 2005 IEEE Workshop on
  • Conference_Location
    San Juan
  • Print_ISBN
    0-7803-9478-X
  • Electronic_ISBN
    0-7803-9479-8
  • Type
    conf
  • DOI
    10.1109/ASRU.2005.1566473
  • Filename
    1566473