DocumentCode :
2290700
Title :
Spoken Term Detection Using Visual Spectrogram Matching
Author :
Lazic, Nevena ; Aarabi, Parham
Author_Institution :
Edward S. Rogers Sr. Dept. of Electr. & Comput. Eng., Univ. of Toronto, Toronto, ON
fYear :
2008
fDate :
15-17 Dec. 2008
Firstpage :
637
Lastpage :
642
Abstract :
This work proposes a novel spoken term detection technique, where the query is in audio format. Detection and retrieval are performed by matching the spectrograms of the spoken document and query as visual images, using ideas from computer vision. Local descriptors are computed on a dense grid over each spectrogram, and the query term is detected using deformable template matching of grids. Detection experiments are performed on an hour-long newscast recording, involving 10 query terms of length 2-3 words. When the query term comes from the document, nearly all other instances of the term in the document are detected; performance degrades when the query is recorded by the user.
Keywords :
document handling; query processing; computer vision; newscast recording; spoken document; spoken term detection; visual images; visual spectrogram matching; Audio recording; Automatic speech recognition; Computer vision; Frequency; Image retrieval; Indexing; Music information retrieval; NIST; Spectrogram; Vocabulary; spectrograms; spoken document retrieval; spoken term detection; template matching
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Multimedia, 2008. ISM 2008. Tenth IEEE International Symposium on
Conference_Location :
Berkeley, CA
Print_ISBN :
978-0-7695-3454-1
Electronic_ISBN :
978-0-7695-3454-1
Type :
conf
DOI :
10.1109/ISM.2008.28
Filename :
4741240