Title :
Automatic generation of phone sets and lexical transcriptions
Author :
Singh, R. ; Raj, B. ; Stern, R.M.
Author_Institution :
Dept. of Electr. & Comput. Eng., Carnegie Mellon Univ., Pittsburgh, PA, USA
Abstract :
Large vocabulary automatic speech recognition systems model words as sequences drawn from a small set of basic sub-word units (the phoneset), which the systems are trained to classify. All words in the system's vocabulary are transcribed in terms of this set in a dictionary. The phoneset and dictionary are specific to a language and are typically designed manually. The system's performance is critically dependent on the quality of the phoneset and the accuracy of the dictionary. The authors attempt to generate the phoneset and dictionary automatically, using only the training data and their transcriptions. This is treated as a joint optimization problem with a maximum a posteriori solution for the dictionary and a maximum likelihood solution for the phoneset and its acoustic models. Experiments with the DARPA Resource Management corpus show that the automatically generated phoneset and dictionary result in recognition accuracies close to those obtained using manually designed ones.
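The joint optimization described in the abstract can be written schematically as follows. This is only a sketch under assumed notation, not the formulation used in the paper: X (training acoustics), W (their word transcriptions), D (the dictionary), and \Lambda (the phoneset together with its acoustic models) are symbols introduced here for illustration, and reading the two estimates as steps that could be alternated is likewise an assumption.

```latex
% Assumed notation (not from the paper):
%   X       training acoustic data
%   W       word-level transcriptions of X
%   D       dictionary mapping each word to a sequence of phoneset units
%   \Lambda phoneset and its acoustic models
\begin{align}
  % Maximum a posteriori estimate of the dictionary, models held fixed
  \hat{D} &= \arg\max_{D} \; P(D \mid X, W, \Lambda)
           = \arg\max_{D} \; P(X \mid W, D, \Lambda)\, P(D) \\
  % Maximum likelihood estimate of the phoneset models, dictionary held fixed
  \hat{\Lambda} &= \arg\max_{\Lambda} \; P(X \mid W, \hat{D}, \Lambda)
\end{align}
```

The second equality in the first line follows from Bayes' rule, assuming the prior on D does not depend on W or \Lambda and dropping the denominator, which is constant in D.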
Keywords :
dictionaries; maximum likelihood estimation; optimisation; speech recognition; word processing; DARPA Resource Management corpus; acoustic models; automatic generation; automatically generated phoneset; basic sub-word units; dictionary; joint optimization problem; large vocabulary automatic speech recognition systems; lexical transcriptions; maximum a posteriori solution; maximum likelihood solution; phone sets; phoneset; recognition accuracies; training data; transcriptions; words; Automatic speech recognition; Computer science; Dictionaries; Equations; Resource management; Speech recognition; System performance; Testing; Training data; Vocabulary;
Conference_Title :
Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on
Conference_Location :
Istanbul
Print_ISBN :
0-7803-6293-4
DOI :
10.1109/ICASSP.2000.862076