Gaussian mixture models of phonetic boundaries for speech recognition

Author

Omar, Mohamed K. ; Hasegawa-Johnson, Mark ; Levinson, Stephen

Author_Institution

Dept. of Electr. & Comput. Eng., Illinois Univ., Urbana, IL, USA

fYear

2001

fDate

2001

Firstpage

33

Lastpage

36

Abstract

A new approach to representing temporal correlation in an automatic speech recognition system is described. It introduces an acoustic feature set that captures the dynamics of the speech signal at phoneme boundaries, used in combination with the traditional acoustic feature set that represents the periods of speech assumed to be quasi-stationary. The newly introduced feature set represents an observed random vector associated with a state transition in the HMM. For the same complexity and number of parameters, this approach improves phoneme recognition accuracy by 3.5% compared to context-independent HMMs. Stop consonant recognition accuracy is increased by 40%.

Keywords

Gaussian processes; acoustic signal processing; computational complexity; correlation methods; hidden Markov models; speech processing; speech recognition; Gaussian mixture models; HMM; acoustic feature set; automatic speech recognition; phoneme recognition; phonetic boundaries; speech signal; state transition; stop consonant recognition; Acoustic measurements; Automatic speech recognition; Context modeling; Decoding; Density measurement; Hidden Markov models; Humans; Probability density function; Solid modeling; Speech recognition

fLanguage

English

Publisher

IEEE

Conference_Titel

Automatic Speech Recognition and Understanding, 2001. ASRU '01. IEEE Workshop on

Print_ISBN

0-7803-7343-X

Type

conf

DOI

10.1109/ASRU.2001.1034582

Filename

1034582