Title :
The impact of speech recognition errors on the effectiveness of spoken Cantonese query retrieval
Author :
Choi, T.K. ; Zhu, X.M. ; Luk, R.W.P. ; Chung, F.L. ; Mak, M.W. ; Lam, M.M. ; Siu, W.C.
Author_Institution :
Dept. of Electron. & Inf. Eng., Hong Kong Polytech. Univ., Kowloon, China
Abstract :
This paper examines the impact of speech recognition errors on the effectiveness of spoken Cantonese query retrieval. The evaluation uses one of the largest test collections provided by NTCIR for Chinese information retrieval. The retrieval system uses one of the best models (2-Poisson) together with the robust bigram indexing strategy. When there are no syllable recognition errors, errors in converting the spelling (called pinyin) to characters degrade performance by 3.9 percentage points, which is not statistically significant. When syllable recognition errors are present, performance drops by 10.2 percentage points, which is statistically significant. We improved the system by merging the /n/ and /l/ phone labels and retraining the syllable-to-text conversion routines. With the improved system, performance dropped by only 6.4 percentage points.
Keywords :
Poisson distribution; indexing; information retrieval; natural language interfaces; speech processing; speech recognition; 2-Poisson model; Chinese information retrieval; NTCIR; phone labels; pinyin; query retrieval; robust bigram indexing strategy; speech recognition errors; spoken Cantonese; syllable recognition errors; syllable-to-text conversion routines; Character recognition; Degradation; Indexing; Information retrieval; Merging; Robustness; Signal processing; Speech recognition; Testing; Writing
Conference_Title :
Intelligent Multimedia, Video and Speech Processing, 2004. Proceedings of 2004 International Symposium on
Print_ISBN :
0-7803-8687-6
DOI :
10.1109/ISIMP.2004.1434037