Title :
Speech recognition by indexing and sequencing
Author :
Franzini, Simone ; Ben-Arie, Jezekiel
Author_Institution :
Machine Vision Lab., Univ. of Illinois at Chicago, Chicago, IL, USA
Abstract :
Recognition by Indexing and Sequencing (RISq) is a general-purpose method for classifying temporal vector sequences. We developed an advanced version of RISq and applied it to isolated-word speech recognition, a task most commonly performed with Hidden Markov Models (HMMs) or Dynamic Time Warping (DTW). RISq differs substantially from both of these methods and offers several advantages over them: robust recognition can be achieved with only a few samples from the input sequence, and training can be carried out with as little as one example per class. This enables much faster training and also allows speech with a variety of accents to be recognized. A two-step classification algorithm is used. First, the training samples closest to each input sample are identified and weighted with a parallel algorithm (indexing). Then, a maximum weighted bipartite graph matching is found between the input sequence and a training sequence, subject to an additional temporal constraint (sequencing). We discuss the application of RISq to speech recognition and compare its architecture and performance with those of Sphinx, a state-of-the-art speech recognizer based on HMMs.
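The two-step classification described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several points are assumptions made only for illustration: the Gaussian weighting with a hypothetical `sigma` parameter, the brute-force (non-parallel) distance computation, the reading of the temporal constraint as "matched pairs must keep their time order", and the function names `index_weights`, `sequence_score`, and `classify`. Under that reading, the order-preserving maximum weighted matching can be computed with a simple dynamic program.

```python
import math


def index_weights(input_seq, train_seq, sigma=1.0):
    """Indexing (sketch): weight each training sample by its closeness
    to each input sample. Gaussian weighting is an assumption."""
    return [
        [
            math.exp(-sum((a - b) ** 2 for a, b in zip(x, t)) / (2 * sigma ** 2))
            for t in train_seq
        ]
        for x in input_seq
    ]


def sequence_score(weights):
    """Sequencing (sketch): maximum total weight of a one-to-one matching
    between input and training samples that preserves temporal order."""
    n = len(weights)
    m = len(weights[0]) if n else 0
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(
                best[i - 1][j],                              # skip input sample i
                best[i][j - 1],                              # skip training sample j
                best[i - 1][j - 1] + weights[i - 1][j - 1],  # match i with j
            )
    return best[n][m]


def classify(input_seq, training, sigma=1.0):
    """Return the best-scoring class label and all class scores.
    `training` maps a label to a list of example sequences."""
    scores = {
        label: max(
            sequence_score(index_weights(input_seq, seq, sigma)) for seq in seqs
        )
        for label, seqs in training.items()
    }
    return max(scores, key=scores.get), scores


# Toy usage: one training example per class, two input samples.
training = {
    "up": [[(0.0,), (1.0,), (2.0,)]],
    "down": [[(2.0,), (1.0,), (0.0,)]],
}
label, scores = classify([(0.1,), (1.9,)], training)
```

In this toy case the rising input matches the "up" example in order, while the "down" example can contribute only a weaker order-preserving matching, which mirrors the abstract's claim that a few input samples and a single training example per class can suffice.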
Keywords :
graph theory; hidden Markov models; speech recognition; DTW; HMM; RISq; bipartite graph matching; dynamic time warping; isolated word speech recognition; parallel algorithm; recognition by indexing and sequencing; robust recognition; speech recognizer; temporal vector sequences; Data structures; Feature extraction; Indexing; Speech; Training
Conference_Title :
Soft Computing and Pattern Recognition (SoCPaR), 2010 International Conference of
Conference_Location :
Paris
Print_ISBN :
978-1-4244-7897-2
DOI :
10.1109/SOCPAR.2010.5686409