• DocumentCode
    2875886
  • Title

    Feature extraction and acoustic modeling: an approach for improved generalization across languages and accents

  • Author

    Dupont, Stéphane ; Ris, Christophe ; Deroo, Olivier ; Poitoux, Sébastien

  • Author_Institution
    Multitel, Mons
  • fYear
    2005
  • fDate
    27-27 Nov. 2005
  • Firstpage
    29
  • Lastpage
    34
  • Abstract
    The paper proposes a solution that brings some advances to the genericity of ASR technology across tasks and languages. A non-linear discriminant model is built from multi-lingual, multi-task speech material in order to classify the acoustic signal into language-independent phonetic units. Instead of being used for direct HMM state likelihood estimation, this model operates as a first stage that produces discriminant features, which can then be used in cascade with a traditional task/language-specific ASR system. This first-stage structure is expected to achieve a strong modeling of the cross-language variability of speech that can better handle pronunciation variations due, for instance, to regional and non-native accents. Moreover, the flexibility of this architecture still allows the development of small task/language-dedicated ASR systems as a second-stage structure, possibly with a small amount of data. The benefit of this architecture is demonstrated through a fine analysis of modeling performance at the phoneme level and on two different isolated word recognition tasks featuring accent variabilities.
  • Keywords
    acoustic signal processing; feature extraction; natural languages; speech recognition; acoustic modeling; acoustic signal; automatic speech recognition; feature extraction; generalization across languages; isolated word recognition tasks; language independent phonetic units; nonlinear discriminant model; Acoustic noise; Automatic speech recognition; Context modeling; Feature extraction; Hidden Markov models; Isolation technology; Loudspeakers; Natural languages; State estimation; Working environment noise
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding, 2005 IEEE Workshop on
  • Conference_Location
    San Juan
  • Print_ISBN
    0-7803-9478-X
  • Electronic_ISBN
    0-7803-9479-8
  • Type

    conf

  • DOI
    10.1109/ASRU.2005.1566527
  • Filename
    1566527