• DocumentCode
    3531333
  • Title
    Efficient subword lattice retrieval for German spoken term detection
  • Author
    Mertens, Timo ; Schneider, Daniel
  • Author_Institution
    Dept. of Electron. & Telecommun., NTNU, Trondheim
  • fYear
    2009
  • fDate
    19-24 April 2009
  • Firstpage
    4885
  • Lastpage
    4888
  • Abstract
    We present a lattice-based spoken term detection (STD) method for German broadcast news data and compare it to a previously proposed fuzzy search. Because the out-of-vocabulary (OOV) problem is pronounced in German, we evaluate suitable subword indexing units for lattice retrieval. Hybrid lattice retrieval of words and subwords is investigated because of the robust nature of words as an indexing unit. We show that by using efficient lattice graph and score pruning techniques, the precision of subword retrieval is increased by 8% absolute with only a small loss in recall. Additionally, a speed-up of up to 6 times can be observed.
  • Keywords
    fuzzy set theory; graph theory; indexing; information retrieval; natural language processing; speech processing; vocabulary; German spoken term detection; fuzzy search; lattice graph; lattice-based STD method; out-of-vocabulary; score pruning techniques; subword indexing; subword lattice retrieval; Broadcasting; Error analysis; Indexing; Lattices; Morphology; Natural languages; Robustness; Speech recognition; Testing; Vocabulary; speech recognition; speech search; spoken document retrieval; spoken term detection
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
  • Conference_Location
    Taipei
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-2353-8
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2009.4960726
  • Filename
    4960726