• DocumentCode
    3716064
  • Title

    MT-based artificial hypothesis generation for unsupervised discriminative language modeling

  • Author

    Erinç Dikici;Murat Saraçlar

  • Author_Institution
    Bogazici University, Department of Electrical and Electronics Engineering, 34342, Bebek, Istanbul, Turkey
  • fYear
    2015
  • Firstpage
    1401
  • Lastpage
    1405
  • Abstract
    Discriminative language modeling (DLM) is used as a postprocessing step to correct automatic speech recognition (ASR) errors. Traditional DLM training requires a large number of ASR N-best lists together with their reference transcriptions. It is possible to incorporate additional text data into training via artificial hypothesis generation through confusion modeling. A weighted finite-state transducer (WFST) or a machine translation (MT) system can be used to generate the artificial hypotheses. When the reference transcriptions are not available, training can be done in an unsupervised way via a target output selection scheme. In this paper we adapt the MT-based artificial hypothesis generation approach to unsupervised discriminative language modeling, and compare it with the WFST-based setting. We achieve improvements in word error rate of up to 0.7% over the generative baseline, which is significant at p < 0.001.
  • Keywords
    "Training","Adaptation models","Data models","Europe","Signal processing","Speech","Manuals"
  • Publisher
    ieee
  • Conference_Titel
    Signal Processing Conference (EUSIPCO), 2015 23rd European
  • Electronic_ISBN
    2076-1465
  • Type

    conf

  • DOI
    10.1109/EUSIPCO.2015.7362614
  • Filename
    7362614