DocumentCode
353507
Title
Heterogeneous lexical units for automatic speech recognition: preliminary investigations
Author
Bazzi, Issam ; Glass, James
Author_Institution
Lab. for Comput. Sci., MIT, Cambridge, MA, USA
Volume
3
fYear
2000
fDate
2000
Firstpage
1257
Abstract
This paper explores the use of the phone and syllable as primary units of representation in the first stage of a two-stage recognizer. A finite-state transducer speech recognizer is utilized to configure the recognition as a two-stage process, where either phone or syllable graphs are computed in the first stage, and passed to the second stage to determine the most likely word hypotheses. Preliminary experiments in a weather information speech understanding domain show that a syllable representation with either bigram or trigram language models provides more constraint than a phonetic representation with a higher-order n-gram language model (up to a 6-gram), and approaches the performance of a more conventional single-stage word-based configuration.
Keywords
graph theory; speech recognition; automatic speech recognition; bigram language model; finite-state transducer speech recognizer; graphs; heterogeneous lexical units; performance; phone; representation; syllable; trigram language model; two-stage process; two-stage recognizer; weather information speech understanding domain; word hypotheses; Automatic speech recognition; Computer science; Glass; Information systems; Laboratories; Natural languages; Speech processing; Speech recognition; Transducers; Vocabulary
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on
Conference_Location
Istanbul
ISSN
1520-6149
Print_ISBN
0-7803-6293-4
Type
conf
DOI
10.1109/ICASSP.2000.861804
Filename
861804