DocumentCode :
730807
Title :
Exemplar-based large vocabulary speech recognition using k-nearest neighbors
Author :
Yanbo Xu ; Siohan, Olivier ; Simcha, David ; Kumar, Sanjiv ; Liao, Hank
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Maryland Coll. Park, College Park, MD, USA
fYear :
2015
fDate :
19-24 April 2015
Firstpage :
5167
Lastpage :
5171
Abstract :
This paper describes a large-scale exemplar-based acoustic modeling approach for large vocabulary continuous speech recognition. We construct an index of labeled training frames, using high-level features extracted from the bottleneck layer of a deep neural network as indexing features. At recognition time, each test frame is turned into a query and a set of k-nearest neighbor frames is retrieved from the index. This set is further filtered using majority voting, and the remaining frames are used to derive an estimate of the context-dependent state posteriors of the query, which can then be used for recognition. Using an approximate nearest neighbor search approach based on asymmetric hashing, we are able to construct an index on over 25,000 hours of training data. We present both frame classification and recognition experiments on a Voice Search task.
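The retrieve, filter, and estimate steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an exact brute-force search in place of the asymmetric-hashing approximate search, and it reads "majority voting" as keeping only those state labels that collect at least `min_votes` neighbors, which is one possible interpretation. The function name `knn_state_posteriors` and the parameters `k` and `min_votes` are hypothetical.

```python
import numpy as np


def knn_state_posteriors(query, index_feats, index_labels, num_states,
                         k=4, min_votes=2):
    """Estimate state posteriors for one query frame from its k nearest
    labeled training frames (illustrative sketch only)."""
    # Squared Euclidean distance from the query to every indexed frame.
    # Assumption: exact search; the paper uses approximate search.
    dists = np.sum((index_feats - query) ** 2, axis=1)

    # Indices of the k closest frames (stable sort keeps ties in index order).
    nearest = np.argsort(dists, kind="stable")[:k]

    # Count how many retrieved neighbors carry each state label.
    votes = np.bincount(index_labels[nearest], minlength=num_states)

    # Assumed voting filter: drop labels with fewer than min_votes neighbors.
    keep = votes >= min_votes
    if not keep.any():
        # Fall back to the most frequent label(s) if nothing survives.
        keep = votes == votes.max()

    # Normalize the surviving counts into a posterior estimate.
    filtered = np.where(keep, votes, 0).astype(float)
    return filtered / filtered.sum()


# Toy index: five 2-D "bottleneck" features with three state labels.
feats = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [1.0, 1.0], [5.0, 5.0]])
labels = np.array([0, 0, 1, 2, 2])
posteriors = knn_state_posteriors(np.array([0.0, 0.0]), feats, labels,
                                  num_states=3)
```

In this toy index, the four nearest frames to the query carry labels 0, 0, 1, and 2, so only state 0 meets the vote threshold and receives all of the posterior mass. Setting `min_votes=1` disables the filter and yields plain normalized neighbor counts.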
Keywords :
feature extraction; file organisation; neural nets; speech recognition; vocabulary; voice equipment; acoustic modeling; asymmetric hashing; context-dependent state posteriors; deep neural network; feature extraction; k-nearest neighbor; recognition time; vocabulary speech recognition; voice search task; Electronic publishing; Indexes; Information services; Market research; Speech recognition; Training; Vocabulary; acoustic modeling; deep neural network; exemplar-based recognition; k-Nearest Neighbor;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location :
South Brisbane, QLD
Type :
conf
DOI :
10.1109/ICASSP.2015.7178956
Filename :
7178956