Title :
On estimation of two-to-one selection-based intelligibility score using speech recognition
Author :
Takano, Yusuke ; Kondo, Kazuhiro
Author_Institution :
Grad. Sch. of Sci. & Eng., Yamagata Univ., Yonezawa, Japan
Abstract :
In this research, we investigated a method for estimating subjective Japanese speech intelligibility using conventional speech recognition systems. We attempted to estimate intelligibility scores of the Japanese diagnostic rhyme test (DRT), a two-to-one selection-based intelligibility test. The forced selection process was simulated with a language model that forces the speech recognizer to choose one of the two words in the word pair. DRT words were mixed with Gaussian noise, babble (multi-speaker) noise, and pseudo-speech noise at various SNRs. The recognition rate was compared with subjective intelligibility scores. When the Japanese DRT was simulated with the speaker-independent phoneme model, the recognition rate for clean speech was low overall. However, the recognition rate improved by 20% with the speaker-adapted model. The rate of deterioration from clean speech with the speaker-adapted model was also closer to the subjective evaluation results than that with the speaker-independent model. Nevertheless, the recognition performance is still insufficient compared to the subjective evaluation results. We are currently working on improving noise tolerance using noise adaptation. We believe this should further improve the recognition rates, bringing the overall accuracy even closer to the subjective results.
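The evaluation loop described in the abstract (mix each DRT word with noise at a target SNR, force a choice between the two words of a pair, and count correct choices) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the per-word score dictionaries (stand-ins for recognizer likelihoods), and the placeholder words `word_a` and `word_b` are all assumptions made for illustration, and speech and noise are assumed to be equal-length sample lists.

```python
import math

def mix_at_snr(speech, noise, snr_db):
    """Scale the noise so the speech-to-noise power ratio equals snr_db, then add it."""
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    # Gain g such that p_speech / (g^2 * p_noise) = 10^(snr_db / 10).
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]

def forced_choice(scores):
    """Two-alternative forced selection: return the candidate word with the higher score."""
    return max(scores, key=scores.get)

def recognition_rate(trials):
    """Fraction of trials in which the forced choice matches the target word.

    trials: list of (target_word, {candidate_word: score}) pairs, where the
    scores are hypothetical recognizer outputs for the two words of one pair.
    """
    correct = sum(1 for target, scores in trials if forced_choice(scores) == target)
    return correct / len(trials)

# Hypothetical scores for two trials of one word pair (higher is better).
trials = [
    ("word_a", {"word_a": -10.0, "word_b": -12.0}),  # chosen word_a: correct
    ("word_b", {"word_a": -9.0, "word_b": -11.0}),   # chosen word_a: wrong
]
rate = recognition_rate(trials)  # 0.5
```

Computing such a rate per noise type and SNR gives values that can be set against the subjective intelligibility scores for the same conditions.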
Keywords :
Gaussian noise; natural language processing; speech intelligibility; speech recognition; DRT words; Japanese diagnostic rhyme test; Japanese speech intelligibility; babble noise; intelligibility scores estimation; language model; multispeaker noise; noise adaptation; noise tolerance; pseudospeech noise; recognition ratio; speaker-adapted model; speaker-independent phoneme; speech recognizer; two-to-one selection-based intelligibility test; word pair; Consumer electronics; Humans; Mobile communication; Natural languages; Speech analysis; Speech enhancement; Speech processing; Testing; Adaptation; Japanese DRT; Objective estimation
Conference_Title :
Consumer Electronics, 2009. ISCE '09. IEEE 13th International Symposium on
Conference_Location :
Kyoto
Print_ISBN :
978-1-4244-2975-2
Electronic_ISBN :
978-1-4244-2976-9
DOI :
10.1109/ISCE.2009.5156846