DocumentCode
2021087
Title
Predicting unseen triphones with senones
Author
Hwang, Mei-Yuh ; Huang, Xuedong ; Alleva, Fileno
Author_Institution
Sch. of Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, USA
Volume
2
fYear
1993
fDate
27-30 April 1993
Firstpage
311
Abstract
In large-vocabulary speech recognition, there are always new triphones that are not covered in the training data. These unseen triphones are usually represented by corresponding diphones or context-independent monophones. It is proposed that decision-tree-based senones be used to generate needed senonic baseforms for unseen triphones. A decision tree is built for each individual Markov state of each phone, and the leaves of the trees constitute the senone codebook. A Markov state of any triphone traverses the corresponding tree until it reaches a leaf to find the senone it is to be associated with. The DARPA 5000-word speaker-independent Wall Street Journal dictation task is used to evaluate the proposed method. The word error rate is reduced by more than 10% when unseen triphones are modeled by the decision-tree-based senones.
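The lookup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The node layout, the phone classes (NASALS, VOWELS), the single example tree, and the senone ids are all hypothetical and chosen only to show how a state of an unseen triphone could walk a per-state tree down to a leaf.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Tuple


@dataclass
class Node:
    # A leaf carries a senone id; an internal node carries a yes/no question
    # about the left or right phonetic context (hypothetical layout).
    senone: Optional[int] = None
    side: str = "left"
    phones: FrozenSet[str] = frozenset()
    yes: Optional["Node"] = None
    no: Optional["Node"] = None


def find_senone(tree: Node, left: str, right: str) -> int:
    """Walk one state's tree until a leaf is reached and return its senone id."""
    node = tree
    while node.senone is None:
        context = left if node.side == "left" else right
        node = node.yes if context in node.phones else node.no
    return node.senone


# Hypothetical phone classes used as tree questions.
NASALS = frozenset({"m", "n", "ng"})
VOWELS = frozenset({"aa", "ae", "iy", "uw"})

# One tree per (phone, state index); only state 0 of /k/ is shown here.
trees: Dict[Tuple[str, int], Node] = {
    ("k", 0): Node(
        side="left", phones=NASALS,
        yes=Node(senone=0),
        no=Node(side="right", phones=VOWELS,
                yes=Node(senone=1), no=Node(senone=2)),
    ),
}


def senonic_baseform(left: str, phone: str, right: str, n_states: int) -> List[int]:
    """Senone sequence for a triphone, whether or not it was seen in training."""
    return [find_senone(trees[(phone, s)], left, right) for s in range(n_states)]
```

Because every question refers only to phone classes, a triphone that never occurred in training still reaches some leaf, which is what lets the trees supply a senonic baseform for it.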
Keywords
Markov processes; dictation; learning (artificial intelligence); speech recognition; trees (mathematics); vocabulary; Markov state; Wall Street Journal dictation; decision-tree-based senones; large-vocabulary speech recognition; senone codebook; training; unseen triphones; word error rate;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 1993. ICASSP-93., 1993 IEEE International Conference on
Conference_Location
Minneapolis, MN, USA
ISSN
1520-6149
Print_ISBN
0-7803-7402-9
Type
conf
DOI
10.1109/ICASSP.1993.319299
Filename
319299