DocumentCode
3124356
Title
Keyword-specific normalization based keyword spotting for spontaneous speech
Author
Weifeng Li ; Qingmin Liao
Author_Institution
Dept. of Electron. Eng., Tsinghua Univ., Shenzhen, China
fYear
2012
fDate
5-8 Dec. 2012
Firstpage
233
Lastpage
237
Abstract
This paper presents a novel architecture for keyword spotting in spontaneous speech, in which the keyword model is trained from a small number of acoustic examples provided by a user. The word-spotting architecture scores patch feature vector sequences extracted with sliding windows, then applies keyword-specific normalization and threshold setting. Dynamic time warping (DTW) based template matching and Gaussian mixture models (GMMs) are proposed to model the keyword, and a separate GMM is proposed to model the non-keywords. Keyword spotting experiments demonstrate the effectiveness of the proposed methods. In particular, the proposed GMM log-likelihood ratio based method achieves an absolute improvement of about 17% in recall rate over the baseline system.
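The sliding-window DTW variant described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define the normalization, so the z-score against background (non-keyword) scores used here is an assumption, and the names `dtw_distance`, `sliding_patches`, `spot_keyword`, and the `threshold` and `hop` parameters are all hypothetical.

```python
import math


def dtw_distance(a, b):
    """Length-normalized DTW distance between two sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m] / (n + m)


def sliding_patches(frames, width, hop):
    """Yield (start_index, patch) for fixed-width windows over the frames."""
    for start in range(0, len(frames) - width + 1, hop):
        yield start, frames[start:start + width]


def spot_keyword(frames, templates, background, threshold, hop=1):
    """Return (start, normalized_score) for windows scoring below threshold.

    Assumption: normalization is a z-score whose mean and standard deviation
    are estimated per keyword from scores on background (non-keyword) speech.
    """
    width = len(templates[0])

    def score(patch):
        # Best match over the user-provided keyword examples.
        return min(dtw_distance(patch, t) for t in templates)

    bg = [score(p) for _, p in sliding_patches(background, width, hop)]
    mean = sum(bg) / len(bg)
    std = math.sqrt(sum((s - mean) ** 2 for s in bg) / len(bg)) or 1.0

    hits = []
    for start, patch in sliding_patches(frames, width, hop):
        z = (score(patch) - mean) / std
        if z < threshold:  # lower distance means a closer match
            hits.append((start, z))
    return hits
```

Because the scores are normalized per keyword, a single threshold value such as `-2.0` can in principle be reused across keywords whose raw DTW distances sit on different scales, which is the motivation for normalizing before thresholding.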
Keywords
Bayes methods; Gaussian processes; feature extraction; hidden Markov models; pattern matching; speech processing; speech recognition; Bayesian information criterion; DTW; GMM log-likelihood ratio based method; Gaussian mixture models; dynamic time warping based template matching; keyword model; keyword-specific normalization based keyword spotting; phonetic hidden Markov model; scoring patch feature vector sequence extraction; sliding windows; speech utterance; spontaneous speech; threshold setting; word-spotting architecture; Acoustics; Data models; Hidden Markov models; Speech; Training; Training data; Vectors; Bayesian Information Criterion; Gaussian mixture model; Keyword spotting; dynamic time warping; sliding window;
fLanguage
English
Publisher
IEEE
Conference_Titel
Chinese Spoken Language Processing (ISCSLP), 2012 8th International Symposium on
Conference_Location
Kowloon
Print_ISBN
978-1-4673-2506-6
Electronic_ISBN
978-1-4673-2505-9
Type
conf
DOI
10.1109/ISCSLP.2012.6423490
Filename
6423490