Title :
A scene-associated training method for mobile robot speech recognition in multisource reverberated environments
Author :
Liu, Jindong ; Johns, Edward ; Yang, Guang-Zhong
Author_Institution :
The Hamlyn Centre, Imperial College London, UK
Abstract :
In this paper, we present a new technique for social mobile robot speech recognition based on scene-associated training models. The key contribution of the paper is a real-time framework that reduces the effect of room reverberation and ambient noise, a challenging problem in speech recognition. Classical approaches train the model on anechoic sound and focus mainly on removing reverberation or noise from the captured sound. Our technique differs in that we train a number of speech recognizers directly on reverberated sound and associate each recognizer with a unique visual scene, to deal with the varying reverberation properties of different rooms. By extracting local features from a captured image and recognizing the scene, the robot can use the speech recognizer trained for the particular structural properties of that scene. We tested our method using a baseline speech recognition model (HTK) across a variety of rooms and different levels of background noise. The results show that associating a visual scene with a corresponding speech recognizer greatly improves the robot's speech recognition accuracy and also increases the computational speed of recognition, compared to competing techniques.
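The dispatch idea described in the abstract (recognize the scene, then use the recognizer trained for it) can be illustrated with a minimal sketch. This is not the authors' implementation: the names Scene, select_scene and recognize are hypothetical, nearest-neighbour matching on a plain feature vector stands in for the paper's local-feature scene recognition, and simple callables stand in for the per-room HTK models.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Hypothetical types: a descriptor is a plain feature vector, and a recognizer
# is any callable that maps audio samples to a transcript.
Descriptor = Sequence[float]
Recognizer = Callable[[Sequence[float]], str]


@dataclass
class Scene:
    name: str
    descriptor: Descriptor  # stand-in for features of a reference image of the room
    recognizer: Recognizer  # stand-in for a model trained on that room's reverberated speech


def squared_distance(a: Descriptor, b: Descriptor) -> float:
    """Squared Euclidean distance between two descriptors of equal length."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_scene(scenes: List[Scene], query: Descriptor) -> Scene:
    """Return the stored scene whose descriptor is closest to the query."""
    if not scenes:
        raise ValueError("no scenes registered")
    return min(scenes, key=lambda s: squared_distance(s.descriptor, query))


def recognize(scenes: List[Scene], image_descriptor: Descriptor,
              audio: Sequence[float]) -> str:
    """Pick the scene from the image descriptor, then run that scene's recognizer."""
    scene = select_scene(scenes, image_descriptor)
    return scene.recognizer(audio)


# Toy data: two rooms, each with a dummy recognizer (assumed for illustration).
scenes = [
    Scene("office", [0.9, 0.1], lambda audio: "office-model transcript"),
    Scene("corridor", [0.2, 0.8], lambda audio: "corridor-model transcript"),
]

if __name__ == "__main__":
    # The query descriptor is closest to the "corridor" reference.
    print(recognize(scenes, [0.25, 0.7], audio=[0.0, 0.1]))
```

In this sketch only one recognizer runs per utterance, which is one plausible reason a scene-selected model could be cheaper at run time than first dereverberating the signal; the abstract itself does not state the cause of the speed-up.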
Keywords :
Databases; Feature extraction; Reverberation; Robots; Speech recognition; Training; Visualization
Conference_Title :
Intelligent Robots and Systems (IROS), 2011 IEEE/RSJ International Conference on
Conference_Location :
San Francisco, CA
Print_ISBN :
978-1-61284-454-1
DOI :
10.1109/IROS.2011.6094669