DocumentCode :
705215
Title :
Robust isolated speech recognition using binary masks
Author :
Karadogan, Seliz Gulsen ; Larsen, Jan ; Syskind Pedersen, Michael ; Boldt, Jesper Bunsow
Author_Institution :
Inf. & Math. Modelling, Tech. Univ. of Denmark, Lyngby, Denmark
fYear :
2010
fDate :
23-27 Aug. 2010
Firstpage :
1988
Lastpage :
1992
Abstract :
In this paper, we present a new approach for robust speaker-independent ASR using binary masks as feature vectors. The method is evaluated on an isolated digit database, TIDIGIT, in three noisy environments (car, bottle and cafe noise types taken from the DRCD Sound Effects Library). Discrete Hidden Markov Models are used for recognition, and the observation vectors are quantized with the K-means algorithm using the Hamming distance. A recognition rate as high as 92% for clean speech is achievable using Ideal Binary Masks (IBMs), which assume that prior target and noise information is available. We propose that using a Target Binary Mask (TBM), for which only prior target information is needed, performs as well as using IBMs. We also propose a TBM estimation method based on target sound estimation using non-negative sparse coding (NNSC). The recognition results for TBMs with and without the estimation method under noisy conditions are evaluated and compared with those obtained using Mel Frequency Cepstral Coefficients (MFCCs). It is observed that binary mask feature vectors are robust to noisy conditions.
Keywords :
hidden Markov models; speech recognition; Hamming distance; K-means algorithm; discrete hidden Markov models; ideal binary masks; isolated digit database; non-negative sparse coding; robust isolated speech recognition; robust speaker independent ASR; target binary mask; target sound estimation; Hidden Markov models; Mel frequency cepstral coefficient; Noise measurement; Signal to noise ratio; Speech; Speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal Processing Conference, 2010 18th European
Conference_Location :
Aalborg
ISSN :
2219-5491
Type :
conf
Filename :
7096488