Title :
Noise Robust Exemplar Matching Using Sparse Representations of Speech
Author :
Yilmaz, Ender ; Gemmeke, Jort F. ; Van hamme, Hugo
Author_Institution :
Electr. Eng. Dept. (ESAT), KU Leuven, Leuven, Belgium
Abstract :
Performing automatic speech recognition using exemplars (templates) promises better duration and coarticulation modeling than conventional approaches such as hidden Markov models (HMMs). Exemplars are spectrographic representations of speech segments extracted from the training data, each associated with a speech unit (e.g., a phone, syllable, half-word or word), and they preserve the complete spectro-temporal content of the speech. Conventional exemplar-matching approaches to automatic speech recognition, such as those based on dynamic time warping, have typically been evaluated only in clean conditions. In this paper, we propose a novel noise robust exemplar matching framework for automatic speech recognition. This recognizer approximates noisy speech segments as a weighted sum of speech and noise exemplars and performs recognition by comparing the reconstruction errors of different classes with respect to a divergence measure. We evaluate the system performance in keyword recognition on the small vocabulary track of the 2nd CHiME Challenge and in connected digit recognition on the AURORA-2 database. The results show that the proposed system achieves results comparable to those of state-of-the-art noise robust recognition systems.
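The recognition principle summarized in the abstract (approximate a noisy segment as a weighted sum of speech and noise exemplars, then pick the class with the smallest reconstruction error) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the divergence or the solver, so the generalized Kullback-Leibler divergence and the multiplicative weight updates used here are assumptions, as are all function names and the toy four-bin "spectra".

```python
import math

EPS = 1e-12  # guards against division by zero and log(0)


def reconstruct(atoms, weights):
    """Weighted sum of exemplars; each exemplar is a list of non-negative floats."""
    dim = len(atoms[0])
    return [sum(w * a[i] for w, a in zip(weights, atoms)) for i in range(dim)]


def kl_divergence(y, y_hat):
    """Generalized Kullback-Leibler divergence d(y || y_hat) (assumed measure)."""
    total = 0.0
    for yi, hi in zip(y, y_hat):
        hi = max(hi, EPS)
        if yi > 0:
            total += yi * math.log(yi / hi)
        total += hi - yi
    return total


def fit_weights(y, atoms, n_iter=200):
    """Non-negative weights from multiplicative updates (assumed solver)."""
    weights = [1.0] * len(atoms)
    atom_sums = [max(sum(a), EPS) for a in atoms]
    for _ in range(n_iter):
        y_hat = reconstruct(atoms, weights)
        ratio = [yi / max(hi, EPS) for yi, hi in zip(y, y_hat)]
        weights = [
            w * sum(ai * ri for ai, ri in zip(a, ratio)) / s
            for w, a, s in zip(weights, atoms, atom_sums)
        ]
    return weights


def classify(y, speech_dicts, noise_atoms, n_iter=200):
    """Return the class whose speech + noise exemplars reconstruct y best."""
    errors = {}
    for label, speech_atoms in speech_dicts.items():
        atoms = speech_atoms + noise_atoms  # class dictionary plus shared noise
        weights = fit_weights(y, atoms, n_iter)
        errors[label] = kl_divergence(y, reconstruct(atoms, weights))
    return min(errors, key=errors.get), errors


# Toy data (hypothetical): two classes with two exemplars each, one noise exemplar.
speech_dicts = {
    "one": [[4.0, 1.0, 0.0, 0.0], [3.0, 2.0, 0.0, 0.0]],
    "two": [[0.0, 0.0, 1.0, 4.0], [0.0, 0.0, 2.0, 3.0]],
}
noise_atoms = [[1.0, 1.0, 1.0, 1.0]]

# Noisy observation: 2 x first exemplar of "one" plus 0.5 x noise exemplar.
noisy = [8.5, 2.5, 0.5, 0.5]

label, errors = classify(noisy, speech_dicts, noise_atoms)
print(label, errors)
```

In this toy case the class "one" dictionary can represent the observation almost exactly, while the class "two" dictionary can only explain the energy in the first two bins through the noise exemplar, so its divergence stays much larger.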
Keywords :
hidden Markov models; pattern matching; speech recognition; HMM; automatic speech recognition; coarticulation modeling; duration modeling; dynamic time warping; exemplar matching approaches; keyword recognition; noise exemplars; noise robust exemplar matching; small vocabulary track; sparse representations; spectrographic representations; speech segmentation; state-of-the-art noise robust recognition systems; Dictionaries; Noise; Noise measurement; Speech; Speech processing; exemplar-based; noise robustness; reconstruction error
Journal_Title :
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
DOI :
10.1109/TASLP.2014.2329188