DocumentCode :
661503
Title :
Robust/fast out-of-vocabulary spoken term detection by N-gram index with exact distance through text/speech input
Author :
Sakamoto, Naohisa ; Nakagawa, Sachiko
Author_Institution :
Toyohashi Univ. of Technol., Toyohashi, Japan
fYear :
2013
fDate :
Oct. 29 2013-Nov. 1 2013
Firstpage :
1
Lastpage :
4
Abstract :
For spoken term detection, handling out-of-vocabulary (OOV) words is very important, and sub-word unit based recognition and retrieval methods have therefore been proposed. This paper describes a very fast Japanese spoken term detection system that is robust to OOV words. We used individual syllables as the sub-word unit in continuous speech recognition and built an n-gram index of syllables over the recognized syllable-based lattice. We proposed an n-gram indexing/retrieval method on the syllable lattice to address OOV words and achieve high-speed retrieval. Specifically, in this paper we redefined the distance of the n-gram and used trigrams, bigrams and unigrams, instead of only trigrams, to calculate the exact distance. In our experiments with both text and speech queries, retrieval performance improved.
Keywords :
information retrieval; natural language processing; speech recognition; text analysis; Japanese spoken term detection system; N-gram index; OOV; indexing-retrieval method; robust-fast out-of-vocabulary spoken term detection; speech query; speech recognition; subword unit; text query; text-speech input; Arrays; Hidden Markov models; Indexes; Lattices; Robustness; Speech; Speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2013 Asia-Pacific
Conference_Location :
Kaohsiung
Type :
conf
DOI :
10.1109/APSIPA.2013.6694366
Filename :
6694366