DocumentCode
2800318
Title
Detection-based speech recognition with sparse point process models
Author
Jansen, Aren ; Niyogi, Partha
Author_Institution
HLT Center of Excellence, Johns Hopkins Univ., Baltimore, MD, USA
fYear
2010
fDate
14-19 March 2010
Firstpage
4362
Lastpage
4365
Abstract
We present a bottom-up approach to connected digit recognition in which (i) the speech signal is transformed into a sparse set of acoustic events in time, (ii) point process models (PPMs) of these events are used to detect candidate digit occurrences, and (iii) the candidate digit detections are reduced to a single digit sequence prediction using a previously proposed graph-based optimization. We find that the performance of this detection-based system on the AURORA2 evaluation matches that of an HTK baseline in clean speech and provides improved robustness to non-stationary noise. Similar robustness to stationary noise sources is achieved with unsupervised PPM adaptation using small amounts of the noisy data.
Keywords
optimisation; speech recognition; AURORA2 evaluation matches; HTK baseline; acoustic events; connected digit recognition; detection-based speech recognition; detection-based system; graph-based optimization; noisy data; sparse point process models; speech signal; stationary noise sources; unsupervised PPM adaptation; Acoustic signal detection; Decoding; Detectors; Event detection; Hidden Markov models; Noise robustness; Predictive models; Speech processing; Vocabulary
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location
Dallas, TX
ISSN
1520-6149
Print_ISBN
978-1-4244-4295-9
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2010.5495636
Filename
5495636