DocumentCode :
2213533
Title :
Spectro-temporal analysis of speech for Spanish phoneme recognition
Author :
Sharifzadeh, Sara ; Serrano, Javier ; Carrabina, Jordi
Author_Institution :
Microelectron. Dept., Univ. Autonoma de Barcelona, Barcelona, Spain
fYear :
2012
fDate :
11-13 April 2012
Firstpage :
548
Lastpage :
551
Abstract :
State-of-the-art automatic speech recognition (ASR) systems mostly use Mel-frequency cepstral coefficients (MFCC) as acoustic features. In this paper, we propose a new discriminative analysis of acoustic features based on spectrogram analysis, which considers both the spectral and the temporal variations of the speech signal. This improves recognition performance, especially for noisy speech and for phonemes with time-domain modulations, such as stops. In this method, the 2D discrete cosine transform (DCT) is applied to small, overlapping, 2D Hamming-windowed patches of the spectrograms of Spanish phonemes and enhanced by means of bicubic interpolation. An adaptive strategy for the patch size along the time axis is proposed, so that phonemes of different durations yield feature vectors of the same length. These vectors are classified using K-nearest neighbor (KNN), linear discriminant analysis (LDA), and reduced-rank LDA (RLDA). Experimental results demonstrate improved recognition performance for noisy speech signals and stops.
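The patch-based feature extraction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no patch sizes, overlap, or coefficient counts, so the 50% overlap, the fixed number of time patches (`n_time`), the patch height, and the number of retained low-order DCT coefficients (`keep`) are all assumptions, as are the function names. The bicubic interpolation step and the KNN/LDA classifiers are omitted. The sketch uses only the Python standard library, with a direct O(n^2) DCT for clarity rather than speed.

```python
import math

def hamming(n):
    """Symmetric Hamming window of length n."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def dct_1d(x):
    """Orthonormal DCT-II of a sequence, computed with the direct formula."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def dct_2d(patch):
    """Separable 2D DCT-II: transform every row, then every column."""
    rows = [dct_1d(r) for r in patch]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def patch_features(spec, height=8, n_time=4, keep=3):
    """Fixed-length feature vector from a spectrogram of any duration.

    spec is a list of frequency rows, each a list of values over time frames.
    Assumption (not from the paper): the patch width adapts to the number of
    frames so that exactly n_time roughly half-overlapping patches span the
    time axis, which keeps the output length independent of duration.
    """
    n_bins, n_frames = len(spec), len(spec[0])
    width = max(keep, math.ceil(2 * n_frames / (n_time + 1)))
    if n_frames < width or n_bins < height or height < keep:
        raise ValueError("spectrogram too small for the requested patches")
    t_starts = [round(j * (n_frames - width) / max(1, n_time - 1))
                for j in range(n_time)]
    f_starts = range(0, n_bins - height + 1, max(1, height // 2))
    wf, wt = hamming(height), hamming(width)
    features = []
    for f0 in f_starts:
        for t0 in t_starts:
            # Apply a separable 2D Hamming window to the patch.
            patch = [[spec[f0 + i][t0 + j] * wf[i] * wt[j]
                      for j in range(width)] for i in range(height)]
            coeffs = dct_2d(patch)
            # Keep only the low-order keep x keep block of DCT coefficients.
            features.extend(coeffs[i][j] for i in range(keep) for j in range(keep))
    return features
```

With these assumed defaults, two spectrograms that share the same number of frequency bins but differ in duration produce vectors of equal length, which is what allows them to be compared directly by a distance-based classifier such as KNN.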
Keywords :
discrete cosine transforms; interpolation; speech recognition; 2D discrete cosine transform; ASR; DCT; K-nearest neighbor; KNN; MFCC; Mel-Frequency cepstral coefficients; RLDA; Spanish phoneme recognition; Spanish phoneme spectrogram; acoustic feature discriminative analysis; adaptive strategy; bicubic interpolation; length vectors; linear discriminative analysis; reduced rank LDA; small overlapped 2D Hamming windowed patches; spectrogram analysis; speech recognition systems; speech signal; speech spectro-temporal variation analysis; time domain modulations; Discrete cosine transforms; Feature extraction; Mel frequency cepstral coefficient; Noise measurement; Spectrogram; Time frequency analysis; Vectors; Automatic speech recognition; DCT transform; MFCC; Spectrogram; TF;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Systems, Signals and Image Processing (IWSSIP), 2012 19th International Conference on
Conference_Location :
Vienna
ISSN :
2157-8672
Print_ISBN :
978-1-4577-2191-5
Type :
conf
Filename :
6208200