Title :
Discriminative semi-supervised training for keyword search in low resource languages
Author :
Hsiao, Ruey-Chang ; Ng, Timothy ; Grezl, Frantisek ; Karakos, Damianos ; Tsakalidis, Stavros ; Nguyen, L. ; Schwartz, R.
Author_Institution :
Raytheon BBN Technol., Cambridge, MA, USA
Abstract :
In this paper, we investigate semi-supervised training for low resource languages where the initial systems may have a high error rate (≥ 70.0% word error rate). To handle the lack of data, we study semi-supervised techniques including data selection, data weighting, discriminative training, and multilayer perceptron learning to improve system performance. The entire suite of semi-supervised methods presented in this paper was evaluated under the IARPA Babel program for the keyword spotting tasks. Our semi-supervised system had the best performance in the OpenKWS13 surprise language evaluation for the limited condition. We describe our work on the Turkish and Vietnamese systems.
Keywords :
learning (artificial intelligence); multilayer perceptrons; natural language processing; speech processing; IARPA Babel program; OpenKWS13 surprise language evaluation; Turkish system; Vietnamese system; data selection; data weighting; discriminative semisupervised training; discriminative training; keyword search; keyword spotting task; low resource languages; multilayer perceptron learning; system performance improvement; word error rate; Adaptation models; Data models; Lattices; Mathematical model; Speech; Speech recognition; Training; keyword spotting; low resource languages; semi-supervised training
Conference_Title :
Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on
Conference_Location :
Olomouc
DOI :
10.1109/ASRU.2013.6707770