DocumentCode :
672389
Title :
Using proxies for OOV keywords in the keyword search task
Author :
Chen, Guoguo ; Yilmaz, Ozgur ; Trmal, Jan ; Povey, Daniel ; Khudanpur, Sanjeev
Author_Institution :
Center for Language & Speech Processing & Human Language Technology Center of Excellence, Johns Hopkins University, Baltimore, MD, USA
fYear :
2013
fDate :
8-12 Dec. 2013
Firstpage :
416
Lastpage :
421
Abstract :
We propose a simple but effective weighted finite state transducer (WFST) based framework for handling out-of-vocabulary (OOV) keywords in a speech search task. State-of-the-art large vocabulary continuous speech recognition (LVCSR) and keyword search (KWS) systems are developed for conversational telephone speech in Tagalog. Word-based and phone-based indexes are created from word lattices, the latter by using the LVCSR system's pronunciation lexicon. Pronunciations of OOV keywords are hypothesized via a standard grapheme-to-phoneme method. In-vocabulary proxies (word or phone sequences) are generated for each OOV keyword using WFST techniques that permit incorporation of a phone confusion matrix. Empirical results when searching for the Babel/NIST evaluation keywords in the Babel 10 hour development-test speech collection show that (i) searching for word proxies in the word index significantly outperforms searching for phonetic representations of OOV words in a phone index, and (ii) while phone confusion information yields minor improvement when searching a phone index, it yields up to 40% improvement in actual term weighted value when searching a word index with word proxies.
Keywords :
document handling; query processing; speech recognition; vocabulary; Babel evaluation keyword; KWS system; LVCSR system pronunciation lexicon; NIST evaluation keyword; OOV keyword pronunciation; OOV keyword search task; Tagalog; WFST-based framework; conversational telephone speech; empirical analysis; in-vocabulary proxies; large-vocabulary continuous speech recognition; out-of-vocabulary keyword handling; phone confusion matrix; phone sequences; phone-based index; speech search task; standard grapheme-to-phoneme method; term weighted value; weighted finite state transducer; word lattices; word proxy search; word sequences; word-based index; Indexes; Keyword search; Lattices; Speech; Training; Transducers; Vocabulary; Keyword Search; Low Resource LVCSR; OOV Keywords; Proxy Keywords; Speech Recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on
Conference_Location :
Olomouc
Type :
conf
DOI :
10.1109/ASRU.2013.6707766
Filename :
6707766