Title :
Spoken digits recognition by subspace decomposition method
Author :
Kusakari, K. ; Kurihara, K. ; Murakami, T. ; Ishida, Y.
Author_Institution :
Department of Electronics and Communication, Meiji University, 1-1-1 Higashimita, Tama-ku, Kawasaki-shi, Kanagawa, 214-8571, Japan
Abstract :
In this paper, we propose a method for spoken digit recognition that combines dynamic programming (DP) matching with subspace decomposition, which uses principal component analysis (PCA) to linearly separate phonetic information from speech data. The method achieves more robust speech recognition with fewer reference speech patterns. Recognition based on the spectral envelope obtained by linear predictive coding (LPC) cannot avoid errors caused by differences between individual speakers, the dynamic variation of features, and similar factors. By using the subspace method, the proposed method eliminates these problems and gives good recognition results with fewer standard speech patterns. We use DP matching for recognition because it matches patterns more efficiently by normalizing the length of vowels. The projection onto the phonetic subspace serves as a feature vector that contains less speaker information. Simulation results show that the proposed method, which uses this projection, is superior to the conventional method that combines LPC spectral envelopes with DP matching.
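The pipeline described in the abstract (project frame-level features onto a PCA-derived subspace, then compare utterances by DP matching) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which principal components span the phonetic subspace, so taking the leading `n_components` directions is an assumption, and the function names `pca_basis`, `project`, `dtw_distance`, and `recognize` are hypothetical. DP matching is written here as plain dynamic time warping with Euclidean frame distances.

```python
import numpy as np

def pca_basis(frames, n_components):
    """Return the mean and leading principal directions of the frame vectors.

    ASSUMPTION: the leading components stand in for the phonetic subspace;
    the abstract does not specify how that subspace is selected.
    """
    mean = frames.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(frames - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(frames, mean, basis):
    """Project each frame (one per row) onto the subspace spanned by basis."""
    return (frames - mean) @ basis.T

def dtw_distance(a, b):
    """DP matching: accumulated Euclidean distance along the best warping path."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            # Allow a frame to be stretched, compressed, or matched one-to-one.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

def recognize(utterance, references):
    """Return the label of the reference pattern closest to the utterance."""
    return min(references, key=lambda label: dtw_distance(utterance, references[label]))
```

In use, each utterance would first be converted to a sequence of frame vectors (for example LPC spectral envelopes), projected with `project`, and then passed to `recognize` together with a dictionary of projected reference patterns keyed by digit label.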
Keywords :
Dynamic programming; Euclidean distance; Linear predictive coding; Pattern matching; Pattern recognition; Principal component analysis; Robustness; Speech analysis; Speech recognition; Uncertainty;
Conference_Title :
TENCON 2004. 2004 IEEE Region 10 Conference
Conference_Location :
Chiang Mai
Print_ISBN :
0-7803-8560-8
DOI :
10.1109/TENCON.2004.1414359