DocumentCode :
573553
Title :
A feature extraction method for speech recognition based on temporal tracking of clusters in spectro-temporal domain
Author :
Esfandian, Nafiseh ; Razzazi, Farbod ; Behrad, Alireza
Author_Institution :
Dept. of Electr. Eng., Islamic Azad Univ., Qaemshahr, Iran
fYear :
2012
fDate :
2-3 May 2012
Abstract :
In this paper, a novel approach is proposed for secondary feature extraction based on cluster tracking in the spectro-temporal domain. Because of the high dimensionality of the spectro-temporal feature space, this domain is unsuitable for practical speech recognition systems. To reduce the dimensionality of the feature space, the weighted K-means (WKM) clustering technique is applied to the spectro-temporal domain. The elements of the mean vectors and covariance matrices of the clusters are taken as the feature vector of each frame. However, the cluster locations change gradually over time. The main idea is that the variations in cluster locations should be tracked temporally, frame by frame, and that the parameters of these variations should be included in the secondary feature vector extracted for each speech frame. Several models are used to register the clusters in each incoming frame. In addition, a new architecture is proposed that classifies the speech frames with a combining classifier using both tracked and non-tracked secondary features. The proposed feature vectors were assessed on the classification of several subsets of TIMIT database phonemes. Using tracked secondary feature vectors, the classification rate on voiced plosives improved to 77.4%, a relative improvement of 1.8% over the non-tracked secondary feature vectors. The results on the other subsets also showed improved classification rates.
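The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`weighted_kmeans`, `frame_features`, `tracked_features`) are hypothetical, points are generic coordinates in the spectro-temporal domain with positive weights (for example, response magnitudes), and tracking is approximated by initialising the clustering of each new frame from the previous frame's centroids. The abstract mentions several registration models but does not specify them, so this registration rule is a stand-in.

```python
import numpy as np

def assign(points, centers):
    # Label each point with the index of its nearest centroid.
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def weighted_kmeans(points, weights, k, init=None, n_iter=100, seed=0):
    """Weighted K-means: each centroid is the weighted mean of its members."""
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)  # assumed strictly positive
    if init is None:
        rng = np.random.default_rng(seed)
        idx = rng.choice(len(points), size=k, replace=False,
                         p=weights / weights.sum())
        centers = points[idx].copy()
    else:
        centers = np.array(init, dtype=float)
    for _ in range(n_iter):
        labels = assign(points, centers)
        new_centers = centers.copy()
        for j in range(k):
            m = labels == j
            if m.any():  # an empty cluster keeps its previous centroid
                new_centers[j] = np.average(points[m], axis=0, weights=weights[m])
        converged = np.allclose(new_centers, centers)
        centers = new_centers
        if converged:
            break
    return centers, assign(points, centers)

def frame_features(points, weights, centers, labels):
    """Concatenate each cluster's mean and upper-triangular covariance terms."""
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)
    dim = points.shape[1]
    iu = np.triu_indices(dim)
    parts = []
    for j, c in enumerate(centers):
        m = labels == j
        cov = np.zeros((dim, dim))
        if m.any():
            w = weights[m] / weights[m].sum()
            diff = points[m] - c
            cov = (w[:, None] * diff).T @ diff  # weighted scatter about centroid
        parts.append(np.concatenate([c, cov[iu]]))
    return np.concatenate(parts)

def tracked_features(prev_centers, points, weights):
    """Assumed tracking rule: warm-start from the previous frame's centroids so
    cluster indices carry over, then append the centroid displacements."""
    prev_centers = np.asarray(prev_centers, dtype=float)
    centers, labels = weighted_kmeans(points, weights, len(prev_centers),
                                      init=prev_centers)
    delta = centers - prev_centers
    feats = np.concatenate([frame_features(points, weights, centers, labels),
                            delta.ravel()])
    return feats, centers
```

With K clusters in a D-dimensional domain, the non-tracked vector in this sketch has K(D + D(D+1)/2) elements, and the tracked vector adds K·D displacement terms. A combining classifier, as the abstract describes, would consume both vectors.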
Keywords :
covariance matrices; feature extraction; pattern clustering; set theory; signal classification; speech recognition; vectors; TIMIT database phonemes; WKM clustering technique; covariance matrices; dimension reduction; mean vector; secondary feature vector extraction; spectrotemporal feature space; speech classification; speech recognition; subsets; temporal cluster tracking; voiced plosives classification; weighted K-means clustering technique; Feature extraction; Filter banks; Sorting; Spectrogram; Speech; Support vector machine classification; Vectors; Auditory system; Clustering methods; Feature extraction; Image matching; Speech processing; Speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Artificial Intelligence and Signal Processing (AISP), 2012 16th CSI International Symposium on
Conference_Location :
Shiraz, Fars
Print_ISBN :
978-1-4673-1478-7
Type :
conf
DOI :
10.1109/AISP.2012.6313709
Filename :
6313709