DocumentCode :
3167528
Title :
Resource configurable spoken query detection using Deep Boltzmann Machines
Author :
Zhang, Yaodong ; Salakhutdinov, Ruslan ; Chang, Hung-An ; Glass, James
Author_Institution :
MIT Comput. Sci. & Artificial Intell. Lab., Cambridge, MA, USA
fYear :
2012
fDate :
25-30 March 2012
Firstpage :
5161
Lastpage :
5164
Abstract :
In this paper, we present a spoken query detection method based on posteriorgrams generated from Deep Boltzmann Machines (DBMs). The proposed method can be deployed in both semi-supervised and unsupervised training scenarios. The DBM-based posteriorgrams were evaluated on a series of keyword spotting tasks using the TIMIT speech corpus. In unsupervised training conditions, the DBM approach improved upon our previous best unsupervised keyword detection performance, obtained with Gaussian mixture model-based posteriorgrams, by over 10%. When limited amounts of labeled data were incorporated into training, the DBM approach required less than one third of the annotated data to achieve performance comparable to that of a system trained on all of the annotated data.
Keywords :
Boltzmann machines; query processing; speech recognition; unsupervised learning; DBM-based posteriorgrams; TIMIT speech corpus; deep Boltzmann machines; keyword spotting tasks; labeled data; resource configurable spoken query detection; semisupervised training; unsupervised training; Data models; Machine learning; Probability; Speech; Speech recognition; Training; Vectors; Deep Boltzmann Machines; posteriorgram; spoken query detection;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location :
Kyoto
ISSN :
1520-6149
Print_ISBN :
978-1-4673-0045-2
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2012.6289082
Filename :
6289082