Generalized mixture of HMMs for continuous speech recognition

Author

Korkmazskiy, Filipp ; Juang, Biing-hwang ; Soong, Frank

Author_Institution

Lucent Technol., AT&T Bell Labs., Murray Hill, NJ, USA

Volume

2

fYear

1997

fDate

21-24 Apr 1997

Firstpage

1443

Abstract

This paper presents a new technique for modeling heterogeneous data sources, such as speech signals received via distinctly different channels. Such a scenario arises when an automatic speech recognition system is deployed in wireless telephony, in which highly heterogeneous channels coexist and interoperate. The problem is that a simple model may be inadequate to describe the diversity of the signal accurately, resulting in unsatisfactory recognition performance. To address this problem, we propose a generalized mixture model (GMM) approach. For speech signals in particular, we use mixtures of hidden Markov models (i.e., GMHMM, generalized mixture of HMMs). By applying discriminative training to the GMHMM, we obtained a 1.0% word error rate for the recognition of digit strings from the wireless database, compared with a 1.4% word error rate for the conventional HMM-based discriminative technique.

Keywords

hidden Markov models; speech recognition; HMM; automatic speech recognition system; continuous speech recognition; discriminative training; generalized mixture model; heterogeneous data sources; highly heterogeneous channels; modeling; signal diversity; speech signals; wireless telephony; word error rate; Automatic speech recognition; Clustering methods; Databases; Error analysis; Linear regression; Robustness; Telephony; Viterbi algorithm

fLanguage

English

Publisher

IEEE

Conference_Title

1997 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-97)

Conference_Location

Munich

ISSN

1520-6149

Print_ISBN

0-8186-7919-0

Type

conf

DOI

10.1109/ICASSP.1997.596220

Filename

596220