Regional Information Center for Science and Technology (RICeST) - Baum-Welch training for segment-based speech recognition

DocumentCode :

3244142

Title :

Baum-Welch training for segment-based speech recognition

Author :

Shu, Han ; Hetherington, I. Lee ; Glass, James

Author_Institution :

Comput. Sci. & Artificial Intelligence Lab., MIT, Cambridge, MA, USA

fYear :

2003

fDate :

30 Nov.-3 Dec. 2003

Firstpage :

Lastpage :

Abstract :

The use of segment-based features and segmentation networks in a segment-based speech recognizer complicates the probabilistic modeling because it alters both the sample space of all possible segmentation paths and the feature observation space. This paper describes a novel Baum-Welch training algorithm for segment-based speech recognition that addresses these issues through an innovative use of finite-state transducers. Unlike the Viterbi training procedure we have used previously, this procedure has the desirable property of not requiring initial seed models. On the PhoneBook telephone-based corpus of read isolated words, the Baum-Welch training algorithm obtained a relative error reduction of 37% on the training set and 5% on the test set, compared to Viterbi-trained models. When combined with a duration model and a more flexible segmentation network, the Baum-Welch-trained models obtain an overall word error rate of 7.6%, which is the best result we have seen published for the 8000-word task.

Keywords :

error statistics; feature extraction; finite state machines; probability; speech recognition; Baum-Welch training; PhoneBook telephone-based corpus; finite-state transducers; probabilistic modeling; read isolated words; segment-based features; segment-based speech recognition; segmentation networks; speech recognizer; word error rate; Artificial intelligence; Automatic speech recognition; Computer science; Glass; Hidden Markov models; Laboratories; Space technology; Speech recognition; Transducers; Viterbi algorithm

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on

Print_ISBN :

0-7803-7980-2

Type :

conf

DOI :

10.1109/ASRU.2003.1318401

Filename :

1318401

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3244142