DocumentCode
661503
Title
Robust/fast out-of-vocabulary spoken term detection by N-gram index with exact distance through text/speech input
Author
Sakamoto, Naohisa ; Nakagawa, Sachiko
Author_Institution
Toyohashi Univ. of Technol., Toyohashi, Japan
fYear
2013
fDate
Oct. 29 2013-Nov. 1 2013
Firstpage
1
Lastpage
4
Abstract
For spoken term detection, it is very important to handle out-of-vocabulary (OOV) words. Sub-word-unit-based recognition and retrieval methods have therefore been proposed. This paper describes a very fast Japanese spoken term detection system that is robust to OOV words. We use individual syllables as the sub-word unit in continuous speech recognition and build an n-gram index of syllables over the recognized syllable lattice. We propose an n-gram indexing/retrieval method on the syllable lattice that addresses OOV words and enables high-speed retrieval. In particular, in this paper we redefine the n-gram distance and use trigrams, bigrams, and unigrams, instead of trigrams alone, to calculate the exact distance. In experiments with both text and speech queries, the retrieval performance improved.
Keywords
information retrieval; natural language processing; speech recognition; text analysis; Japanese spoken term detection system; N-gram index; OOV; indexing-retrieval method; robust-fast out-of-vocabulary spoken term detection; speech query; speech recognition; subword unit; text query; text-speech input; Arrays; Hidden Markov models; Indexes; Lattices; Robustness; Speech; Speech recognition;
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2013 Asia-Pacific
Conference_Location
Kaohsiung
Type
conf
DOI
10.1109/APSIPA.2013.6694366
Filename
6694366