Title :
Extracting deep bottleneck features for visual speech recognition
Author :
Sui, Chao ; Togneri, Roberto ; Bennamoun, Mohammed
Author_Institution :
Sch. of Comput. Sci. & Software Eng., Univ. of Western Australia, Crawley, WA, Australia
Abstract :
Motivated by recent progress in the use of deep learning techniques for acoustic speech recognition, we present in this paper a visual deep bottleneck feature (DBNF) learning scheme using a stacked auto-encoder combined with other techniques. Experimental results show that our proposed deep feature learning scheme yields an approximately 24% relative improvement in visual speech accuracy. To the best of our knowledge, this is the first study to use deep bottleneck features for visual speech recognition, and the first to show that deep bottleneck visual features can achieve a significant accuracy improvement on this task.
Keywords :
speech recognition; deep bottleneck features; stacked auto-encoder; visual speech accuracy; visual speech recognition; Accuracy; Discrete cosine transforms; Feature extraction; Hidden Markov models; Speech; Visualization; deep bottleneck feature; stacked denoising auto-encoder
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location :
South Brisbane, QLD
DOI :
10.1109/ICASSP.2015.7178224