DocumentCode
2645250
Title
Study of Relationships between Intra-speaker's Speech Variability and Speech Recognition Performance
Author
Tsuge, Satoru ; Fukumi, Minoru ; Shishibori, Masami ; Ren, Fuji ; Kita, Kenji ; Kuroiwa, Shingo
Author_Institution
Tokushima Univ.
fYear
2006
fDate
12-15 Dec. 2006
Firstpage
41
Lastpage
44
Abstract
Even when a speaker uses a speaker-dependent speech recognition system, recognition performance varies. This is because speech quality varies with factors such as emotion and background noise, even when the speaker and the utterance remain the same. However, the relationships between intra-speaker speech variability and speech recognition performance are not clear. We therefore focus on the intra-speaker speech variability that affects speech recognition performance. To investigate these relationships, we have been collecting speech data since November 2002. Using part of this speech corpus, we conducted speech recognition experiments. In this paper, we analyze the relationships between intra-speaker speech variability and phoneme accuracy by means of correlation analysis. As factors in the correlation analysis, we use the number of errors, the speaking rate, and the likelihood. The results show a strong correlation between the number of substitution errors and the phoneme accuracy, whereas the correlations for the numbers of deletion and insertion errors are low. We therefore consider that phonemes overlap because the feature parameters vary with the speaking rate. To improve the phoneme accuracy, a method for discriminating between phonemes needs to be studied. On the other hand, although the correlation between the phoneme accuracy and the speaking rate appears to be low, strong correlations are found between the speaking rate and the numbers of deletion and insertion errors. Because the number of insertion errors and the number of deletion errors counterbalance each other, the correlation between the speaking rate and the phoneme accuracy was low. Nevertheless, we consider that the speaking rate needs to be normalized because it influences the numbers of deletion and insertion errors.
Keywords
correlation methods; speech processing; speech recognition; correlation analysis; deletion errors; insertion errors; intra-speaker speech variability; phoneme accuracy; speaker-dependent speech recognition system; speaking rate; speech quality; Background noise; Cellular phones; Degradation; Frequency; Navigation; Signal processing; Speech analysis; Speech enhancement; Speech processing; Speech recognition
fLanguage
English
Publisher
ieee
Conference_Titel
Intelligent Signal Processing and Communications, 2006. ISPACS '06. International Symposium on
Conference_Location
Yonago
Print_ISBN
0-7803-9732-0
Electronic_ISBN
0-7803-9733-9
Type
conf
DOI
10.1109/ISPACS.2006.364831
Filename
4212218