• DocumentCode
    672393
  • Title

    Discriminative semi-supervised training for keyword search in low resource languages

  • Author

    Hsiao, Ruey-Chang ; Ng, Timothy ; Grezl, Frantisek ; Karakos, Damianos ; Tsakalidis, Stavros ; Nguyen, L. ; Schwartz, R.

  • Author_Institution
    Raytheon BBN Technol., Cambridge, MA, USA
  • fYear
    2013
  • fDate
    8-12 Dec. 2013
  • Firstpage
    440
  • Lastpage
    445
  • Abstract
    In this paper, we investigate semi-supervised training for low resource languages where the initial systems may have a high error rate (≥ 70.0% word error rate). To handle the lack of data, we study semi-supervised techniques including data selection, data weighting, discriminative training and multilayer perceptron learning to improve system performance. The entire suite of semi-supervised methods presented in this paper was evaluated under the IARPA Babel program for the keyword spotting tasks. Our semi-supervised system had the best performance in the OpenKWS13 surprise language evaluation for the limited condition. In this paper, we describe our work on the Turkish and Vietnamese systems.
  • Keywords
    learning (artificial intelligence); multilayer perceptrons; natural language processing; speech processing; IARPA Babel program; OpenKWS13 surprise language evaluation; Turkish system; Vietnamese system; data selection; data weighting; discriminative semisupervised training; discriminative training; keyword search; keyword spotting task; low resource languages; multilayer perceptron learning; system performance improvement; word error rate; Adaptation models; Data models; Lattices; Mathematical model; Speech; Speech recognition; Training; keyword spotting; low resource languages; semi-supervised training;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on
  • Conference_Location
    Olomouc
  • Type

    conf

  • DOI
    10.1109/ASRU.2013.6707770
  • Filename
    6707770