DocumentCode :
177466
Title :
Kernel methods match Deep Neural Networks on TIMIT
Author :
Huang, Po-Sen ; Avron, Haim ; Sainath, Tara N. ; Sindhwani, Vikas ; Ramabhadran, Bhuvana
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA
fYear :
2014
fDate :
4-9 May 2014
Firstpage :
205
Lastpage :
209
Abstract :
Despite their theoretical appeal and grounding in tractable convex optimization techniques, kernel methods are often not the first choice for large-scale speech applications due to their significant memory requirements and computational expense. In recent years, randomized approximate feature maps have emerged as an elegant mechanism to scale up kernel methods. Still, in practice, a large number of random features is required to obtain acceptable accuracy in predictive tasks. In this paper, we develop two algorithmic schemes to address this computational bottleneck in the context of kernel ridge regression. The first scheme is a specialized distributed block coordinate descent procedure that avoids the explicit materialization of the feature space data matrix, while the second scheme gains efficiency by combining multiple weak random feature models in an ensemble learning framework. We demonstrate that these schemes enable kernel methods to match the performance of state-of-the-art Deep Neural Networks on TIMIT for speech recognition and classification tasks. In particular, we obtain the best classification error rates reported on TIMIT using kernel methods.
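The core technique the abstract refers to, randomized approximate feature maps for kernel ridge regression, can be illustrated with a minimal sketch. This is not the authors' distributed implementation; it is a generic random Fourier features construction (Rahimi–Recht style) for the Gaussian kernel, followed by ridge regression in the approximate feature space. All variable names and the toy data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_fourier_features(X, D, gamma, rng):
    """Map X of shape (n, d) to D random Fourier features whose inner
    products approximate the Gaussian kernel exp(-gamma * ||x - y||^2)."""
    d = X.shape[1]
    # Frequencies sampled from the kernel's Fourier transform (a Gaussian),
    # plus uniform random phases.
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, D))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

# Toy regression problem standing in for acoustic feature data.
X = rng.normal(size=(200, 5))
y = np.sin(X.sum(axis=1))

Z = random_fourier_features(X, D=500, gamma=0.5, rng=rng)

# Kernel ridge regression reduces to linear ridge regression on Z:
# w = (Z^T Z + lam * I)^{-1} Z^T y
lam = 1e-3
w = np.linalg.solve(Z.T @ Z + lam * np.eye(Z.shape[1]), Z.T @ y)
pred = Z @ w
```

In practice, the number of features D needed for competitive accuracy can be very large, which is exactly the bottleneck the paper's two schemes target: block coordinate descent avoids materializing the full n-by-D matrix Z at once, and the ensemble scheme averages several small-D models instead of training one large one.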
Keywords :
learning (artificial intelligence); neural nets; optimisation; regression analysis; speech recognition; TIMIT; deep neural networks; ensemble learning framework; feature space data matrix; kernel methods; kernel ridge regression; large-scale speech applications; multiple weak random feature models; randomized approximate feature maps; specialized distributed block coordinate descent procedure; speech classification tasks; speech recognition; tractable convex optimization techniques; Computational modeling; Hidden Markov models; Kernel; Neural networks; Speech recognition; Training; Training data; deep learning; distributed computing; large-scale kernel machines; random features; speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence, Italy
Type :
conf
DOI :
10.1109/ICASSP.2014.6853587
Filename :
6853587