Feature extraction and acoustic modeling: an approach for improved generalization across languages and accents

Author

Dupont, Stéphane ; Ris, Christophe ; Deroo, Olivier ; Poitoux, Sébastien

Author_Institution

Multitel, Mons

fYear

2005

fDate

27-27 Nov. 2005

Firstpage

29

Lastpage

34

Abstract

The paper proposes a solution that makes ASR technology more generic across tasks and languages. A non-linear discriminant model is built from multi-lingual, multi-task speech material in order to classify the acoustic signal into language-independent phonetic units. Rather than being used for direct HMM state likelihood estimation, this model operates as a first stage that produces discriminant features, which are then used in cascade with a traditional task/language-specific ASR system. This first-stage structure is expected to model the cross-language variability of speech strongly enough to better handle pronunciation variations due, for instance, to regional and non-native accents. Moreover, the flexibility of this architecture still allows small task/language-dedicated ASR systems to be developed as a second-stage structure, possibly with a small amount of data. The benefit of this architecture is demonstrated through a detailed analysis of modeling performance at the phoneme level and on two different isolated word recognition tasks featuring accent variability.
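
The two-stage cascade described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the first stage is a single-hidden-layer network with random, untrained weights, the layer sizes are arbitrary, and the log of the phonetic-unit posteriors is used as the discriminant feature vector. The function names (`first_stage_features`, `random_layer`, and so on) are hypothetical.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def affine(weights, bias, x):
    """Compute W x + b for a weight matrix stored as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def random_layer(n_out, n_in, rng):
    """Untrained placeholder layer; a real system would learn these weights."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def first_stage_features(frame, hidden, output):
    """Map one acoustic frame to log-posteriors over phonetic units.

    Assumption: log-posteriors stand in for the 'discriminant features'
    that a second-stage, task/language-specific recognizer would consume.
    """
    activations = [math.tanh(v) for v in affine(hidden[0], hidden[1], frame)]
    posteriors = softmax(affine(output[0], output[1], activations))
    return [math.log(p) for p in posteriors]


rng = random.Random(0)
N_INPUT, N_HIDDEN, N_UNITS = 13, 8, 5  # arbitrary illustrative sizes
hidden = random_layer(N_HIDDEN, N_INPUT, rng)
output = random_layer(N_UNITS, N_HIDDEN, rng)

frame = [rng.gauss(0.0, 1.0) for _ in range(N_INPUT)]  # stand-in acoustic frame
features = first_stage_features(frame, hidden, output)
print(len(features))
```

In such a design, the first stage would be trained once on pooled multi-lingual data, while the second-stage recognizer is trained on the resulting feature vectors for each target task or language.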

Keywords

acoustic signal processing; feature extraction; natural languages; speech recognition; acoustic modeling; acoustic signal; automatic speech recognition; generalization across languages; isolated word recognition tasks; language independent phonetic units; nonlinear discriminant model; Acoustic noise; Automatic speech recognition; Context modeling; Feature extraction; Hidden Markov models; Isolation technology; Loudspeakers; Natural languages; State estimation; Working environment noise

fLanguage

English

Publisher

ieee

Conference_Titel

Automatic Speech Recognition and Understanding, 2005 IEEE Workshop on

Conference_Location

San Juan

Print_ISBN

0-7803-9478-X

Electronic_ISBN

0-7803-9479-8

Type

conf

DOI

10.1109/ASRU.2005.1566527

Filename

1566527