• DocumentCode
    2660152
  • Title

    Better statistical estimation can benefit all phrases in phrase-based statistical machine translation

  • Author

    Sima'an, Khalil ; Mylonakis, Markos

  • Author_Institution
    Inst. for Logic, Univ. of Amsterdam, Amsterdam
  • fYear
    2008
  • fDate
    15-19 Dec. 2008
  • Firstpage
    237
  • Lastpage
    240
  • Abstract
    The heuristic estimates of conditional phrase translation probabilities are based on frequency counts in a word-aligned parallel corpus. Earlier attempts at more principled estimation using Expectation-Maximization (EM) underperform this heuristic. This paper shows that a recently introduced estimator based on smoothing might provide a good alternative. When all phrase pairs are estimated (no length cut-off), this estimator slightly outperforms the heuristic estimator.
  • Keywords
    expectation-maximisation algorithm; language translation; smoothing methods; conditional phrase translation probabilities; expectation-maximization; phrase-based statistical machine translation; statistical estimation; word-aligned parallel corpus; Concurrent computing; Containers; Data mining; Frequency estimation; Logic; Parameter estimation; Probability; State estimation; Training data; Transduction
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language Technology Workshop, 2008. SLT 2008. IEEE
  • Conference_Location
    Goa
  • Print_ISBN
    978-1-4244-3471-8
  • Electronic_ISBN
    978-1-4244-3472-5
  • Type

    conf

  • DOI
    10.1109/SLT.2008.4777884
  • Filename
    4777884