Title :
Detection-based speech recognition with sparse point process models
Author :
Jansen, Aren ; Niyogi, Partha
Author_Institution :
HLT Center of Excellence, Johns Hopkins Univ., Baltimore, MD, USA
Abstract :
We present a bottom-up approach to connected digit recognition in which (i) the speech signal is transformed into a sparse set of acoustic events in time, (ii) point process models (PPM) of these events are used to detect candidate digit occurrences, and (iii) the candidate digit detections are reduced to a single digit sequence prediction by using a previously proposed graph-based optimization. We find the performance of this detection-based system on the AURORA2 evaluation matches that of an HTK baseline in clean speech and provides improved robustness to non-stationary noise. A similar robustness to stationary noise sources is achieved with unsupervised PPM adaptation using small amounts of the noisy data.
Keywords :
optimisation; speech recognition; AURORA2 evaluation matches; HTK baseline; acoustic events; connected digit recognition; detection-based speech recognition; detection-based system; graph-based optimization; noisy data; sparse point process models; speech signal; stationary noise sources; unsupervised PPM adaptation; Acoustic signal detection; Decoding; Detectors; Event detection; Hidden Markov models; Noise robustness; Predictive models; Speech processing; Vocabulary
Conference_Title :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
Print_ISBN :
978-1-4244-4295-9
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2010.5495636