Spoken digits recognition by subspace decomposition method

Author

Kusakari, K. ; Kurihara, K. ; Murakami, T. ; Ishida, Y.

Author_Institution

Department of Electronics and Communication, Meiji University, 1-1-1 Higashimita, Tama-ku, Kawasaki-shi, Kanagawa, 214-8571, Japan

Volume

A

fYear

2004

fDate

24-24 Nov. 2004

Firstpage

72

Lastpage

75

Abstract

In this paper, we propose a method for spoken digit recognition that combines dynamic programming (DP) matching with subspace decomposition, which linearly separates phonetic information from speech data using principal component analysis (PCA). The method achieves more robust recognition with fewer reference speech patterns. Recognition based on the spectral envelope obtained by linear predictive coding (LPC) cannot avoid errors caused by the uncertainty arising from speaker individuality, the dynamic variation of features, and so on. By using the subspace method, the proposed method eliminates these problems and gives good recognition results with fewer standard speech patterns. We use DP matching for recognition because it allows more efficient pattern matching by normalizing the length of vowels. Simulation results show that the proposed method, which uses the projection onto the phonetic subspace (a feature vector that contains less speaker information), is superior to the conventional method that combines LPC spectral envelopes with DP matching.
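The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the phonetic subspace is constructed, so the sketch assumes, purely for illustration, that a "speaker" subspace is estimated by PCA and that phonetic features are what remains after projecting it out. The function names (pca_basis, remove_subspace, dp_distance, recognize) are hypothetical, and the local distance is Euclidean, as suggested by the keywords.

```python
import numpy as np


def pca_basis(X, k):
    """Return the mean of the rows of X and its top-k principal directions.

    The directions are returned as orthonormal columns of a (dim, k) matrix.
    """
    mean = X.mean(axis=0)
    # SVD of the centered data: the rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k].T


def remove_subspace(frames, mean, basis):
    """Project frames onto the orthogonal complement of span(basis).

    ASSUMPTION: basis spans speaker-dependent directions, so the residual
    is treated as the phonetic feature vector. The paper may define the
    decomposition differently.
    """
    centered = frames - mean
    return centered - centered @ basis @ basis.T


def dp_distance(a, b):
    """DP matching (dynamic time warping) cost between two frame sequences.

    a has shape (n, dim) and b has shape (m, dim). The local distance is
    Euclidean, with symmetric steps (insertion, deletion, match).
    """
    n, m = len(a), len(b)
    local = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
            cost[i, j] = local[i - 1, j - 1] + best_prev
    return cost[n, m]


def recognize(utterance, references):
    """Return the label whose reference pattern has the smallest DP cost."""
    return min(references, key=lambda label: dp_distance(utterance, references[label]))
```

In use, each reference digit and each test utterance would first be converted to a sequence of spectral feature frames, passed through remove_subspace, and then compared with recognize. Because DP matching aligns sequences nonlinearly in time, an utterance with lengthened vowels can still match its reference at low cost.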

Keywords

Dynamic programming; Euclidean distance; Linear predictive coding; Pattern matching; Pattern recognition; Principal component analysis; Robustness; Speech analysis; Speech recognition; Uncertainty;

fLanguage

English

Publisher

IEEE

Conference_Titel

TENCON 2004. 2004 IEEE Region 10 Conference

Conference_Location

Chiang Mai

Print_ISBN

0-7803-8560-8

Type

conf

DOI

10.1109/TENCON.2004.1414359

Filename

1414359