Regional Information Center for Science and Technology - Discriminative training of HMM stream exponents for audio-visual speech recognition

DocumentCode :

1858210

Title :

Discriminative training of HMM stream exponents for audio-visual speech recognition

Author :

Potamianos, Gerasimos ; Graf, Hans Peter

Author_Institution :

AT&T Labs., Florham Park, NJ, USA

Volume :

fYear :

1998

fDate :

12-15 May 1998

Firstpage :

3733

Abstract :

We propose the use of discriminative training by means of the generalized probabilistic descent (GPD) algorithm to estimate hidden Markov model (HMM) stream exponents for audio-visual speech recognition. Synchronized audio and visual features are used to respectively train audio-only and visual-only single-stream HMMs of identical topology by maximum likelihood. A two-stream HMM is then obtained by combining the two single-stream HMMs and introducing exponents that weigh the log-likelihood of each stream. We present the GPD algorithm for stream exponent estimation, consider a possible initialization, and apply it to the single-speaker connected-letters task of the AT&T bimodal database. We demonstrate the superior performance of the resulting multi-stream HMM to the audio-only, visual-only, and audio-visual single-stream HMMs.
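
The stream-exponent combination described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the per-class scores, and the exponent values 0.7 and 0.3 are all hypothetical, and the GPD training step that would estimate the exponents is not shown. The sketch only shows how two exponents weight the audio and visual log-likelihoods before the best-scoring class is picked.

```python
def two_stream_log_likelihood(log_p_audio, log_p_visual, lam_audio, lam_visual):
    """Exponent-weighted sum of per-class stream log-likelihoods.

    Raising each stream likelihood to an exponent is the same as
    multiplying its log-likelihood by that exponent.
    """
    return [lam_audio * a + lam_visual * v
            for a, v in zip(log_p_audio, log_p_visual)]


def best_class(scores):
    """Index of the class with the highest combined score."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Hypothetical log-likelihoods for three candidate classes (illustrative only).
log_p_audio = [-10.0, -12.0, -11.0]
log_p_visual = [-8.0, -5.0, -9.0]

# Hypothetical exponents; in the paper these are estimated discriminatively.
combined = two_stream_log_likelihood(log_p_audio, log_p_visual, 0.7, 0.3)
audio_only = two_stream_log_likelihood(log_p_audio, log_p_visual, 1.0, 0.0)
visual_only = two_stream_log_likelihood(log_p_audio, log_p_visual, 0.0, 1.0)

print(best_class(combined), best_class(audio_only), best_class(visual_only))
```

Setting one exponent to 1 and the other to 0 recovers the corresponding single-stream decision, which is why the choice of exponents matters for the combined recognizer.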

Keywords :

audio-visual systems; feature extraction; hidden Markov models; maximum likelihood estimation; probability; speech recognition; synchronisation; AT&T bimodal database; HMM stream exponents; audio features; audio-only stream; audio-visual speech recognition; discriminative training; generalized probabilistic descent algorithm; hidden Markov model; initialization; log-likelihood; maximum likelihood; single speaker connected letters task; stream exponent estimation; synchronized features; two-stream HMM; visual features; visual-only stream; Automatic speech recognition; Hidden Markov models; Lips; Mutual information; Speech recognition; Streaming media; Testing; Topology; Visual databases; Vocabulary;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on

Conference_Location :

Seattle, WA

ISSN :

1520-6149

Print_ISBN :

0-7803-4428-6

Type :

conf

DOI :

10.1109/ICASSP.1998.679695

Filename :

679695

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1858210