DocumentCode
3531333
Title
Efficient subword lattice retrieval for German spoken term detection
Author
Mertens, Timo ; Schneider, Daniel
Author_Institution
Dept. of Electron. & Telecommun., NTNU, Trondheim
fYear
2009
fDate
19-24 April 2009
Firstpage
4885
Lastpage
4888
Abstract
We present a lattice-based spoken term detection (STD) method for German broadcast news data and compare it to a previously proposed fuzzy search. Because the out-of-vocabulary (OOV) problem is significant in German, we evaluate suitable subword indexing units for lattice retrieval. Since words are a robust indexing unit, we also investigate hybrid lattice retrieval of words and subwords. We show that efficient lattice graph and score pruning techniques increase the precision of subword retrieval by 8% absolute with only a small loss in recall. Additionally, a speed-up of up to a factor of 6 is observed.
Keywords
fuzzy set theory; graph theory; indexing; information retrieval; natural language processing; speech processing; vocabulary; German spoken term detection; fuzzy search; lattice graph; lattice-based STD method; out-of-vocabulary; score pruning techniques; subword indexing; subword lattice retrieval; Broadcasting; Error analysis; Indexing; Lattices; Morphology; Natural languages; Robustness; Speech recognition; Testing; Vocabulary; speech recognition; speech search; spoken document retrieval; spoken term detection;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
Conference_Location
Taipei
ISSN
1520-6149
Print_ISBN
978-1-4244-2353-8
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2009.4960726
Filename
4960726