• DocumentCode
    661503
  • Title
    Robust/fast out-of-vocabulary spoken term detection by N-gram index with exact distance through text/speech input
  • Author
    Sakamoto, Naohisa; Nakagawa, Sachiko
  • Author_Institution
    Toyohashi Univ. of Technol., Toyohashi, Japan
  • fYear
    2013
  • fDate
    Oct. 29 2013-Nov. 1 2013
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    For spoken term detection, handling out-of-vocabulary (OOV) words is very important, and sub-word unit based recognition and retrieval methods have therefore been proposed. This paper describes a very fast Japanese spoken term detection system that is robust to OOV words. We use individual syllables as the sub-word unit in continuous speech recognition and build an n-gram index of syllables over the recognized syllable lattice. We propose an n-gram indexing/retrieval method on the syllable lattice to address OOV terms and achieve high-speed retrieval. Specifically, in this paper we redefine the n-gram distance and use trigrams, bigrams, and unigrams, instead of trigrams alone, to calculate the exact distance. In experiments with both text and speech queries, retrieval performance improved.
  • Keywords
    information retrieval; natural language processing; speech recognition; text analysis; Japanese spoken term detection system; N-gram index; OOV; indexing-retrieval method; robust-fast out-of-vocabulary spoken term detection; speech query; speech recognition; subword unit; text query; text-speech input; Arrays; Hidden Markov models; Indexes; Lattices; Robustness; Speech; Speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2013 Asia-Pacific
  • Conference_Location
    Kaohsiung
  • Type
    conf
  • DOI
    10.1109/APSIPA.2013.6694366
  • Filename
    6694366