Title :
Query-by-example Spoken Term Detection For OOV terms
Author :
Parada, Carolina ; Sethy, Abhinav ; Ramabhadran, Bhuvana
Author_Institution :
Center for Language & Speech Process., Johns Hopkins Univ., Baltimore, MD, USA
fDate :
Nov. 13 2009-Dec. 17 2009
Abstract :
The goal of spoken term detection (STD) technology is to allow open vocabulary search over large collections of speech content. In this paper, we address cases where the search terms of interest (queries) are acoustic examples. These are provided either by identifying a region of interest in a speech stream or by speaking the query term. Queries often relate to named entities and foreign words, which typically have poor coverage in the vocabulary of large vocabulary continuous speech recognition (LVCSR) systems. Throughout this paper, we focus on query-by-example search for such out-of-vocabulary (OOV) query terms. We build upon a finite state transducer (FST) based search and indexing system to address query-by-example search for OOV terms by representing both the query and the index as phonetic lattices from the output of an LVCSR system. We provide results comparing different representations and generation mechanisms for both queries and indexes built with word and combined word and subword units. We also present a two-pass method in which the best hit identified in an initial pass serves as the query for a query-by-example search that augments the STD search results. The results demonstrate that query-by-example search can yield significantly better performance, with an actual term-weighted value (ATWV) of 0.479 compared to a baseline ATWV of 0.325 that uses reference pronunciations for OOVs. Further improvements can be obtained with the proposed two-pass approach and with filtering using the expected unigram counts from the LVCSR system's lexicon.
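For readers unfamiliar with the metric, the following is a minimal sketch of how a term-weighted value can be computed, assuming the definition from the NIST 2006 STD evaluation (false-alarm weight 999.9, one non-target trial per second of speech). ATWV is this value at the system's actual decision threshold. The abstract does not state the scoring configuration used, so the constant, the `TermCounts` type, and the function name are illustrative assumptions rather than the authors' code.

```python
from dataclasses import dataclass
from typing import Iterable

# Assumption: false-alarm weight from the NIST 2006 STD evaluation definition.
BETA = 999.9


@dataclass
class TermCounts:
    """Hypothetical per-term tallies after aligning system hits to the reference."""
    n_true: int      # occurrences of the term in the reference transcripts
    n_correct: int   # system detections that match a reference occurrence
    n_spurious: int  # system detections that match nothing (false alarms)


def term_weighted_value(counts: Iterable[TermCounts],
                        speech_seconds: float,
                        beta: float = BETA) -> float:
    """Return 1 minus the mean of (P_miss + beta * P_fa) over scored terms."""
    total_cost = 0.0
    n_terms = 0
    for c in counts:
        if c.n_true == 0:
            # Terms with no reference occurrence have undefined P_miss; skip them.
            continue
        p_miss = 1.0 - c.n_correct / c.n_true
        # Non-target trials are approximated as seconds of speech minus true hits.
        p_fa = c.n_spurious / (speech_seconds - c.n_true)
        total_cost += p_miss + beta * p_fa
        n_terms += 1
    if n_terms == 0:
        raise ValueError("no terms with reference occurrences to score")
    return 1.0 - total_cost / n_terms
```

Under this definition a system that finds every occurrence with no false alarms scores 1.0, and a system that returns nothing scores 0.0. Because the false-alarm weight is large, a few spurious hits on a rare term can push its contribution below zero.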
Keywords :
indexing; query processing; speech processing; speech recognition; vocabulary; OOV terms; actual term-weighted value; finite state transducer; foreign words; indexing system; large vocabulary continuous speech recognition systems; named-entities; open vocabulary search; out-of-vocabulary query terms; phonetic lattices; query-by-example search; query-by-example spoken term detection; reference pronunciations; speech content; speech stream; spoken term detection technology; Acoustic signal detection; Indexing; Information retrieval; Lattices; Music information retrieval; Natural languages; Speech processing; Speech recognition; Transducers; Vocabulary;
Conference_Titel :
Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on
Conference_Location :
Merano
Print_ISBN :
978-1-4244-5478-5
Electronic_ISBN :
978-1-4244-5479-2
DOI :
10.1109/ASRU.2009.5373341