Title :
Exemplar-based large vocabulary speech recognition using k-nearest neighbors
Author :
Xu, Yanbo ; Siohan, Olivier ; Simcha, David ; Kumar, Sanjiv ; Liao, Hank
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Maryland Coll. Park, College Park, MD, USA
Abstract :
This paper describes a large-scale exemplar-based acoustic modeling approach for large vocabulary continuous speech recognition. We construct an index of labeled training frames, using high-level features extracted from the bottleneck layer of a deep neural network as indexing features. At recognition time, each test frame is turned into a query and its k nearest neighbor frames are retrieved from the index. This set is further filtered using majority voting, and the remaining frames are used to estimate the context-dependent state posteriors of the query, which can then be used for recognition. Using an approximate nearest neighbor search approach based on asymmetric hashing, we are able to construct an index on over 25,000 hours of training data. We present both frame classification and recognition experiments on a Voice Search task.
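The retrieval, voting, and posterior-estimation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several things are assumptions: exact brute-force search stands in for the approximate asymmetric-hashing search, short tuples stand in for DNN bottleneck features, and the voting filter is assumed to keep only neighbors whose state label is among the `top_labels` most frequent labels in the retrieved set (the abstract does not specify the exact rule). The function and parameter names are hypothetical.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_state_posteriors(query, index_features, index_labels, k=5, top_labels=2):
    """Estimate state posteriors for one query frame from its k nearest neighbors.

    Assumption: exact brute-force search replaces the approximate
    asymmetric-hashing search mentioned in the abstract.
    """
    # Rank every indexed training frame by distance to the query frame.
    order = sorted(
        range(len(index_features)),
        key=lambda i: squared_distance(query, index_features[i]),
    )
    neighbor_labels = [index_labels[i] for i in order[:k]]

    # Assumed voting filter: keep only neighbors carrying one of the
    # `top_labels` most frequent state labels among the k neighbors.
    votes = Counter(neighbor_labels)
    kept = dict(votes.most_common(top_labels))

    # Posterior estimate: relative frequency of each label among kept frames.
    total = sum(kept.values())
    return {label: count / total for label, count in kept.items()}


# Toy index: 2-D features stand in for bottleneck features (hypothetical data).
index_features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (1.0, 1.0), (0.9, 1.1), (5.0, 5.0)]
index_labels = ["s1", "s1", "s1", "s2", "s2", "s3"]

posteriors = knn_state_posteriors((0.05, 0.05), index_features, index_labels, k=5, top_labels=1)
posteriors_two = knn_state_posteriors((0.05, 0.05), index_features, index_labels, k=5, top_labels=2)
```

With `top_labels=1` all probability mass goes to the winning state; with `top_labels=2` the mass is split according to neighbor counts. In a recognizer, such per-frame posteriors would take the place of the acoustic model scores.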
Keywords :
feature extraction; file organisation; neural nets; speech recognition; vocabulary; voice equipment; acoustic modeling; asymmetric hashing; context-dependent state posteriors; deep neural network; k-nearest neighbor; recognition time; vocabulary speech recognition; voice search task; Electronic publishing; Indexes; Information services; Market research; Training; exemplar-based recognition
Conference_Title :
2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference_Location :
South Brisbane, QLD
DOI :
10.1109/ICASSP.2015.7178956