DocumentCode :
1253805
Title :
Automatic generation of subword units for speech recognition systems
Author :
Singh, Rita ; Raj, Bhiksha ; Stern, Richard M.
Author_Institution :
Dept. of Electr. & Comput. Eng., Carnegie Mellon Univ., Pittsburgh, PA, USA
Volume :
10
Issue :
2
fYear :
2002
fDate :
2/1/2002
Firstpage :
89
Lastpage :
99
Abstract :
Large vocabulary continuous speech recognition (LVCSR) systems traditionally represent words in terms of smaller subword units. Both during training and during recognition, they require a mapping table, called the dictionary, which maps words into sequences of these subword units. The performance of the LVCSR system depends critically on the definition of the subword units and the accuracy of the dictionary. In current LVCSR systems, both these components are manually designed. While manually designed subword units generalize well, they may not be the optimal units of classification for the specific task or environment for which an LVCSR system is trained. Moreover, when human expertise is not available, it may not be possible to design good subword units manually. There is clearly a need for data-driven design of these LVCSR components. In this paper, we present a complete probabilistic formulation for the automatic design of subword units and dictionary, given only the acoustic data and their transcriptions. The proposed framework permits easy incorporation of external sources of information, such as the spellings of words in terms of a nonideographic script.
Keywords :
maximum likelihood estimation; speech recognition; LVCSR systems; acoustic data; automatic generation; classification; data-driven design; dictionary; large vocabulary continuous speech recognition; mapping table; nonideographic script; probabilistic formulation; speech recognition systems; spellings; subword units; Computer science; Dictionaries; Government; Hidden Markov models; Humans; Information resources; Laboratories; Speech recognition; US Department of Transportation; Vocabulary;
fLanguage :
English
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1063-6676
Type :
jour
DOI :
10.1109/89.985546
Filename :
985546