Title :
Automatic lexical pronunciations generation and update
Author :
Choueiter, Ghinwa F. ; Seneff, Stephanie ; Glass, James R.
Author_Institution :
MIT Comput. Sci. & Artificial Intelligence Lab., Cambridge
Abstract :
Most automatic speech recognizers use a dictionary that maps words to one or more canonical pronunciations. Such entries are typically hand-written by lexical experts. In this research, we investigate a new approach for automatically generating lexical pronunciations using a linguistically motivated subword model, and refining the pronunciations with spoken examples. The approach is evaluated on an isolated word recognition task with a 2k-word lexicon of restaurant and street names. A letter-to-sound model is first used to generate seed baseforms for the lexicon. Then, spoken utterances of words in the lexicon are presented to a subword recognizer, and the top hypotheses are used to update the lexical baseforms. The spelling of each word is also used to constrain the subword search space and generate spelling-constrained baseforms. The results obtained are quite encouraging and indicate that our approach can be successfully used to learn valid pronunciations of new words.
Keywords :
linguistics; speech recognition; speech synthesis; automatic lexical pronunciation generation; automatic speech recognizer; letter-to-sound model; lexical pronunciation update; linguistically motivated subword model; spelling-constrained baseform; word pronunciation; Artificial intelligence; Automatic speech recognition; Broadcasting; Computer science; Decision trees; Decoding; Dictionaries; Glass; Laboratories; Vocabulary; lexical pronunciations
Conference_Title :
Automatic Speech Recognition & Understanding, 2007. ASRU. IEEE Workshop on
Conference_Location :
Kyoto
Print_ISBN :
978-1-4244-1746-9
Electronic_ISBN :
978-1-4244-1746-9
DOI :
10.1109/ASRU.2007.4430113