Regional Information Center for Science and Technology - Segmental minimum Bayes-risk decoding for automatic speech recognition

DocumentCode :

959805

Title :

Segmental minimum Bayes-risk decoding for automatic speech recognition

Author :

Goel, Vaibhava ; Kumar, Shankar ; Byrne, William

Author_Institution :

IBM T. J. Watson Res. Center, Yorktown Heights, NY, USA

Volume :

Issue :

Year :

2004

Date :

5/1/2004

Firstpage :

234

Lastpage :

249

Abstract :

Minimum Bayes-risk (MBR) speech recognizers have been shown to yield improvements over the conventional maximum a posteriori probability (MAP) decoders through N-best list rescoring and A* search over word lattices. We present a segmental minimum Bayes-risk decoding (SMBR) framework that simplifies the implementation of MBR recognizers through the segmentation of the N-best lists or lattices over which the recognition is to be performed. This paper presents lattice cutting procedures that underlie SMBR decoding. Two of these procedures are based on a risk minimization criterion while a third one is guided by word-level confidence scores. In conjunction with SMBR decoding, these lattice segmentation procedures give consistent improvements in recognition word error rate (WER) on the Switchboard corpus. We also discuss an application of risk-based lattice cutting to multiple-system SMBR decoding and show that it is related to other system combination techniques such as ROVER. This strategy combines lattices produced from multiple ASR systems and is found to give WER improvements in a Switchboard evaluation system.
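The contrast between MAP decoding and MBR rescoring of an N-best list can be illustrated with a minimal sketch. It is not the authors' implementation and does not perform the segmentation or lattice cutting that the abstract describes. It assumes a word-level Levenshtein distance as the loss function, and the function names (`edit_distance`, `map_decode`, `mbr_decode`) and the toy hypotheses and posterior values are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple

NBest = List[Tuple[str, float]]  # (hypothesis text, posterior probability)


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,               # delete a reference word
                curr[j - 1] + 1,           # insert a hypothesis word
                prev[j - 1] + (r != h),    # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def map_decode(nbest: NBest) -> str:
    """MAP rule: return the hypothesis with the highest posterior."""
    return max(nbest, key=lambda item: item[1])[0]


def mbr_decode(nbest: NBest) -> str:
    """MBR rule: return the hypothesis with the lowest expected loss,
    where the expectation is taken over the (renormalized) N-best list."""
    total = sum(p for _, p in nbest)
    best_hyp, best_risk = nbest[0][0], float("inf")
    for hyp, _ in nbest:
        risk = sum(
            (p / total) * edit_distance(ref.split(), hyp.split())
            for ref, p in nbest
        )
        if risk < best_risk:
            best_hyp, best_risk = hyp, risk
    return best_hyp


# Hypothetical toy N-best list; the posteriors are made up for illustration.
nbest = [
    ("we can go", 0.40),
    ("he saw them", 0.35),
    ("he saw him", 0.25),
]

print(map_decode(nbest))  # we can go
print(mbr_decode(nbest))  # he saw them
```

In this toy list the MAP choice is the single most probable string, while the MBR choice is the hypothesis closest on average to the rest of the probability mass, so the two rules disagree. The pairwise loss computation is quadratic in the list size, which hints at why restricting the search to shorter segments can simplify an implementation.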

Keywords :

Bayes methods; error statistics; maximum likelihood decoding; maximum likelihood estimation; minimisation; speech coding; speech recognition; ASR system combination; N-best lists; acoustic data segmentation; automatic speech recognition; extended-ROVER; lattice cutting procedures; lattice segmentation procedures; maximum a posteriori probability; risk minimization criterion; segmental minimum Bayes-risk decoding; Switchboard corpus; utterance level; word error rate; word lattices; word-level confidence scores; Automatic speech recognition; Decoding; Equations; Error analysis; Hidden Markov models; Lattices; Loss measurement; Maximum a posteriori estimation; Natural languages; Risk management

Language :

English

Journal_Title :

Speech and Audio Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1063-6676

Type :

jour

DOI :

10.1109/TSA.2004.825678

Filename :

1288151

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=959805