• DocumentCode
    2789949
  • Title

    A comparative study on methods of weighted language model training for reranking LVCSR N-best hypotheses

  • Author

    Oba, Takanobu ; Hori, Takaaki ; Nakamura, Atsushi

  • Author_Institution
    NTT Commun. Sci. Labs., NTT Corp., Japan
  • fYear
    2010
  • fDate
    14-19 March 2010
  • Firstpage
    5126
  • Lastpage
    5129
  • Abstract
    This paper focuses on discriminative n-gram language models for a large vocabulary speech recognition task. Specifically, we compare three training methods: Reranking Boosting (ReBst), Minimum Error Rate Training (MERT), and the Weighted Global Log-Linear Model (W-GCLM). All three have a mechanism for handling sample weights, which help produce an accurate model and act as impact factors of hypotheses during training. W-GCLM is proposed in this paper. We discuss the relationship between the three methods by comparing their loss functions. We also compare them experimentally by reranking N-best hypotheses under several conditions. We show that MERT and W-GCLM are different types of expansion of ReBst, each with its own advantages. Our experimental results reveal that W-GCLM outperforms ReBst, and that whether MERT or W-GCLM is superior depends on the training and test conditions.
  • Keywords
    speech recognition; N-best hypotheses reranking; discriminative n-gram language model; loss function; minimum error rate training; reranking boosting; vocabulary speech recognition task; weighted global log-linear model; weighted language model training; Boosting; Error analysis; Error correction; Laboratories; Lattices; Natural languages; Parameter estimation; Speech recognition; Testing; Vocabulary; Discriminative LM; Error Correction; MERT; Reranking Boost; Weighted GCLM;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
  • Conference_Location
    Dallas, TX
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-4295-9
  • Electronic_ISBN
    1520-6149
  • Type

    conf

  • DOI
    10.1109/ICASSP.2010.5495028
  • Filename
    5495028