Title :
Voice search of structured media data
Author :
Song, Young-In ; Wang, Ye-Yi ; Ju, Yun-Cheng ; Seltzer, Mike ; Tashev, Ivan ; Acero, Alex
Author_Institution :
Korea Univ., Seoul
Abstract :
This paper addresses the problem of using unstructured queries to search a structured database in voice search applications. Incorporating the structural information in music metadata reduces the end-to-end search error by 15% on text queries and by up to 11% on spoken queries. Building on that, an HMM sequential rescoring model reduces the error rate by 28% on text queries and by up to 23% on spoken queries relative to the baseline system. Furthermore, a phonetic similarity model is introduced to compensate for speech recognition errors; it improves end-to-end search accuracy consistently across different levels of speech recognition accuracy.
Keywords :
hidden Markov models; meta data; query processing; speech recognition; HMM sequential rescoring model; hidden Markov model; music metadata; phonetic similarity model; speech recognition error compensation; spoken query; structured media database; text query; unstructured query; voice search application; Arm; Computer errors; Costs; Databases; Error analysis; Hidden Markov models; Motion pictures; Music information retrieval; Natural languages; Speech recognition; HMMs; language model based information retrieval; phonetic confusability; spoken language understanding; voice search
Conference_Title :
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
Conference_Location :
Taipei
Print_ISBN :
978-1-4244-2353-8
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2009.4960490