Decoding optimal state sequence with smooth state likelihoods

Author

Zeljkovic, Ilija

Author_Institution

AT&T Bell Labs., Murray Hill, NJ, USA

Volume

1

fYear

1996

fDate

7-10 May 1996

Firstpage

129

Abstract

A novel algorithm that allows the decoding of hidden Markov model (HMM) state sequences while constraining the state likelihoods to be more uniform is presented. In HMM-based speech recognizers, the decoded optimal state sequence is restricted by the HMM topology and the grammar. Thus, the most likely state sequence derived by the Viterbi algorithm can be influenced by a few states with very high likelihoods, often resulting in recognition errors. This paper presents a method for decoding state sequences with less volatile state probabilities by introducing penalties proportional to the difference between the current state likelihood and the highest state likelihood for the particular time frame. These penalties are added to the cumulative likelihoods in the Viterbi forward path at every time frame. This technique, referred to as the smooth state likelihood decoding algorithm (SSLDA), reduced recognition error rates substantially on connected digit tests performed on two speech databases derived from field trials. For variable-length digit strings, the error rate was reduced by more than 40% on one field trial database and by more than 60% on the other.
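
The penalty scheme described in the abstract can be sketched roughly as follows. This is only an illustrative reading, not the paper's implementation: the abstract does not give the exact penalty formula, so the sketch assumes log-domain scores and a penalty of alpha * (log-likelihood of the state minus the best log-likelihood in that frame), which is zero or negative and is added to the cumulative score at every frame. The function name smooth_viterbi and the weight alpha are hypothetical.

```python
import math


def smooth_viterbi(log_init, log_trans, log_emis, alpha=0.0):
    """Viterbi decoding with a per-frame penalty on low-likelihood states.

    log_init[s]     : log initial probability of state s
    log_trans[i][j] : log transition probability from state i to state j
    log_emis[t][s]  : log likelihood of frame t under state s (at least one frame)
    alpha           : hypothetical penalty weight; alpha = 0 gives plain Viterbi

    Assumption (not stated in the abstract): the penalty is linear in the
    log domain, alpha * (log_emis[t][s] - max over states of log_emis[t]).
    """
    n_states = len(log_init)
    n_frames = len(log_emis)

    def penalized(t, s):
        best = max(log_emis[t])  # highest state log-likelihood in frame t
        return log_emis[t][s] + alpha * (log_emis[t][s] - best)

    # Initialisation with the first frame.
    score = [log_init[s] + penalized(0, s) for s in range(n_states)]
    back = []  # back[t - 1][j] = best predecessor of state j at frame t

    # Forward pass: the penalty enters the cumulative score at every frame.
    for t in range(1, n_frames):
        new_score, ptr = [], []
        for j in range(n_states):
            i_best = max(range(n_states),
                         key=lambda i: score[i] + log_trans[i][j])
            ptr.append(i_best)
            new_score.append(score[i_best] + log_trans[i_best][j]
                             + penalized(t, j))
        score = new_score
        back.append(ptr)

    # Backtrace from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Toy example with made-up numbers: two states that prefer to stay put.
stay, switch = math.log(0.9), math.log(0.1)
log_init = [math.log(0.5), math.log(0.5)]
log_trans = [[stay, switch], [switch, stay]]
log_emis = [[-1.0, -3.0], [-3.0, -1.5], [-1.0, -3.0]]

print(smooth_viterbi(log_init, log_trans, log_emis, alpha=0.0))
print(smooth_viterbi(log_init, log_trans, log_emis, alpha=3.0))
```

Under this assumed linear log-domain form, the per-frame maximum contributes the same total to every path, so the penalty effectively weights emission scores more heavily relative to transition scores. The penalty actually used in the paper may be defined differently.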

Keywords

Viterbi decoding; error statistics; grammars; hidden Markov models; optimisation; probability; sequences; smoothing methods; speech recognition; HMM based speech recognizers; HMM topology; Viterbi algorithm; Viterbi forward path; connected digit tests; cumulative likelihoods; field trial database; grammar; hidden Markov model; optimal state sequence decoding; recognition error rates reduction; recognition errors; smooth state likelihood decoding algorithm; speech databases; state probabilities; variable length digit strings; Databases; Decoding; Error analysis; Hidden Markov models; Inspection; Performance evaluation; Speech recognition; Testing; Topology; Viterbi algorithm

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1996. ICASSP-96. Conference Proceedings., 1996 IEEE International Conference on

Conference_Location

Atlanta, GA

ISSN

1520-6149

Print_ISBN

0-7803-3192-3

Type

conf

DOI

10.1109/ICASSP.1996.540307

Filename

540307