Error corrective mechanisms for speech recognition

Author

Mangu, Lidia ; Padmanabhan, Mukund

Author_Institution

IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA

Volume

1

fYear

2001

fDate

2001

Firstpage

29

Abstract

In the standard MAP approach to speech recognition, the goal is to find the word sequence with the highest posterior probability given the acoustic observation. A number of alternative approaches have been proposed for directly optimizing the word error rate, the most commonly used evaluation criterion. One of them, the consensus decoding approach, converts a word lattice into a confusion network which specifies the word-level confusions at different time intervals, and outputs the word with the highest posterior probability from each word confusion set. The paper presents a method for discriminating between the correct and alternate hypotheses in a confusion set using additional knowledge sources extracted from the confusion networks. We use transformation-based learning for inducing a set of rules to guide a better decision between the top two candidates with the highest posterior probabilities in each confusion set. The choice of this learning method is motivated by the perspicuous representation of the rules induced, which can provide insight into the cause of the errors of a speech recognizer. In experiments on the Switchboard corpus, we show significant improvements over the consensus decoding approach.
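The decoding step described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the data layout, the `EMPTY` deletion marker, the function names, the example posteriors, and the single hand-written rule are all assumptions made for the example. In the paper the rules are induced by transformation-based learning; here one hypothetical rule simply stands in for a learned one.

```python
from typing import Callable, List, Tuple

# Assumed layout: a confusion network is a list of confusion sets, and each
# confusion set is a list of (word, posterior probability) pairs.
Candidate = Tuple[str, float]
ConfusionSet = List[Candidate]
# A rule inspects the top two candidates and returns True when the
# runner-up should be preferred. This interface is hypothetical.
Rule = Callable[[Candidate, Candidate], bool]

EMPTY = "-"  # hypothetical marker for the empty (deletion) hypothesis


def consensus_decode(network: List[ConfusionSet]) -> List[str]:
    """Pick the word with the highest posterior in each confusion set."""
    return [max(cs, key=lambda wp: wp[1])[0] for cs in network]


def decode_with_rules(network: List[ConfusionSet], rules: List[Rule]) -> List[str]:
    """Consensus decoding, but let rules switch to the second-best word."""
    output = []
    for cs in network:
        ranked = sorted(cs, key=lambda wp: wp[1], reverse=True)
        choice = ranked[0][0]
        if len(ranked) >= 2:
            best, runner_up = ranked[0], ranked[1]
            # The first rule that fires flips the decision to the runner-up.
            if any(rule(best, runner_up) for rule in rules):
                choice = runner_up[0]
        output.append(choice)
    return output


def prefer_word_over_close_deletion(best: Candidate, runner_up: Candidate) -> bool:
    """Hypothetical rule: if a deletion wins only narrowly, keep the word."""
    return best[0] == EMPTY and best[1] - runner_up[1] < 0.05


# Made-up posteriors, for illustration only.
example_network = [
    [("i", 0.9), ("a", 0.1)],
    [(EMPTY, 0.48), ("have", 0.45), ("had", 0.07)],
    [("it", 0.6), ("at", 0.4)],
]

print(consensus_decode(example_network))
print(decode_with_rules(example_network, [prefer_word_over_close_deletion]))
```

With an empty rule list, `decode_with_rules` reduces to plain consensus decoding, which makes the effect of each added rule easy to inspect in isolation.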

Keywords

error correction; graph theory; hidden Markov models; learning (artificial intelligence); natural languages; probability; set theory; speech recognition; Switchboard corpus; confusion network; consensus decoding approach; error corrective mechanisms; knowledge sources; posterior probability; speech recognizer; transformation-based learning; word confusion set; word error rate; word lattice; word-level confusions; Decision making; Decoding; Error analysis; Error correction; Lattices; Learning systems; Speech recognition; Viterbi algorithm

fLanguage

English

Publisher

IEEE

Conference_Title

Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on

Conference_Location

Salt Lake City, UT

ISSN

1520-6149

Print_ISBN

0-7803-7041-4

Type

conf

DOI

10.1109/ICASSP.2001.940759

Filename

940759