DocumentCode
672393
Title
Discriminative semi-supervised training for keyword search in low resource languages
Author
Hsiao, Ruey-Chang ; Ng, Timothy ; Grezl, Frantisek ; Karakos, Damianos ; Tsakalidis, Stavros ; Nguyen, L. ; Schwartz, R.
Author_Institution
Raytheon BBN Technol., Cambridge, MA, USA
fYear
2013
fDate
8-12 Dec. 2013
Firstpage
440
Lastpage
445
Abstract
In this paper, we investigate semi-supervised training for low resource languages where the initial systems may have a high error rate (≥ 70.0% word error rate). To handle the lack of data, we study semi-supervised techniques, including data selection, data weighting, discriminative training, and multilayer perceptron learning, to improve system performance. The entire suite of semi-supervised methods presented in this paper was evaluated under the IARPA Babel program for the keyword spotting tasks. Our semi-supervised system had the best performance in the OpenKWS13 surprise language evaluation for the limited condition. In this paper, we describe our work on the Turkish and Vietnamese systems.
Keywords
learning (artificial intelligence); multilayer perceptrons; natural language processing; speech processing; IARPA Babel program; OpenKWS13 surprise language evaluation; Turkish system; Vietnamese system; data selection; data weighting; discriminative semisupervised training; discriminative training; keyword search; keyword spotting task; low resource languages; multilayer perceptron learning; system performance improvement; word error rate; Adaptation models; Data models; Lattices; Mathematical model; Speech; Speech recognition; Training; keyword spotting; low resource languages; semi-supervised training;
fLanguage
English
Publisher
IEEE
Conference_Titel
Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on
Conference_Location
Olomouc
Type
conf
DOI
10.1109/ASRU.2013.6707770
Filename
6707770