Title :
Graph-based semi-supervised acoustic modeling in DNN-based speech recognition
Author :
Liu, Yuzong ; Kirchhoff, Katrin
Author_Institution :
Dept. of Electr. Eng., Univ. of Washington, Seattle, WA, USA
Abstract :
This paper describes the combination of two recent machine learning techniques for acoustic modeling in speech recognition: deep neural networks (DNNs) and graph-based semi-supervised learning (SSL). While DNNs have been shown to be powerful supervised classifiers and have achieved considerable success in speech recognition, graph-based SSL can exploit valuable complementary information derived from the manifold structure of the unlabeled test data. Previous work on graph-based SSL in acoustic modeling has been limited to frame-level classification tasks and has not been compared to, or integrated with, state-of-the-art DNN/HMM recognizers. This paper presents the first integration of graph-based SSL with DNN-based speech recognition and analyzes its effect on word recognition performance. The approach is evaluated on two small-vocabulary speech recognition tasks and shows a significant improvement in HMM state classification accuracy as well as a consistent reduction in word error rate over a state-of-the-art DNN/HMM baseline.
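As an illustration of the general idea in the abstract, the sketch below smooths per-frame class posteriors over a nearest-neighbor graph that spans labeled and unlabeled frames. It is a generic label-propagation example, not the algorithm used in the paper. The function names (`knn_affinity`, `propagate`), the toy 2-D "features", the simulated DNN posteriors, and all parameter values are assumptions made for illustration.

```python
import numpy as np

def knn_affinity(X, k=3, sigma=1.0):
    """Symmetric k-nearest-neighbor graph with Gaussian (RBF) edge weights."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # squared distances
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)                               # no self-loops
    idx = np.argsort(-W, axis=1)[:, :k]                    # k strongest edges per node
    mask = np.zeros(W.shape, dtype=bool)
    mask[np.arange(len(X))[:, None], idx] = True
    mask |= mask.T                                         # symmetrize
    return W * mask

def propagate(W, prior, labeled, Y, alpha=0.5, iters=50):
    """Iterate F <- alpha * P F + (1 - alpha) * prior, clamping labeled rows to Y."""
    P = W / W.sum(axis=1, keepdims=True)                   # row-stochastic transitions
    F = prior.copy()
    F[labeled] = Y
    for _ in range(iters):
        F = alpha * (P @ F) + (1 - alpha) * prior
        F[labeled] = Y
    return F / F.sum(axis=1, keepdims=True)

# Toy data (assumed): two well-separated clusters of four "frames" each.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1], [5.1, 5.1]])
labeled = np.array([0, 4])                                 # one labeled frame per class
Y = np.array([[1.0, 0.0], [0.0, 1.0]])

# Simulated classifier posteriors used as priors; frame 3 is deliberately wrong.
prior = np.array([[1.0, 0.0], [0.7, 0.3], [0.7, 0.3], [0.4, 0.6],
                  [0.0, 1.0], [0.3, 0.7], [0.3, 0.7], [0.3, 0.7]])

F = propagate(knn_affinity(X), prior, labeled, Y)
pred = F.argmax(axis=1)
print("prior decisions:   ", prior.argmax(axis=1).tolist())
print("smoothed decisions:", pred.tolist())
```

In this toy setup the graph pulls frame 3 toward its neighbors, so its smoothed decision agrees with the rest of its cluster even though its prior did not. The weight `alpha` controls how much the graph neighborhood is trusted relative to the classifier's own posterior.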
Keywords :
graph theory; hidden Markov models; learning (artificial intelligence); neural nets; pattern classification; speech recognition; DNN-based speech recognition; HMM state classification accuracy; acoustic modeling; deep neural networks; frame-level classification tasks; graph-based SSL; graph-based semisupervised acoustic modeling; graph-based semisupervised learning; machine learning techniques; small vocabulary speech recognition tasks; supervised classifiers; unlabeled test data; word recognition performance; Accuracy; Acoustics; Feature extraction; Training; Vectors; graph-based learning; semi-supervised learning
Conference_Title :
Spoken Language Technology Workshop (SLT), 2014 IEEE
DOI :
10.1109/SLT.2014.7078570