DocumentCode
2660152
Title
Better statistical estimation can benefit all phrases in phrase-based statistical machine translation
Author
Sima'an, Khalil ; Mylonakis, Markos
Author_Institution
Inst. for Logic, Univ. of Amsterdam, Amsterdam
fYear
2008
fDate
15-19 Dec. 2008
Firstpage
237
Lastpage
240
Abstract
The heuristic estimates of conditional phrase translation probabilities are based on frequency counts in a word-aligned parallel corpus. Earlier attempts at more principled estimation using Expectation-Maximization (EM) underperform this heuristic. This paper shows that a recently introduced estimator based on smoothing might provide a good alternative. When all phrase pairs are estimated (no length cut-off), this estimator slightly outperforms the heuristic estimator.
Keywords
expectation-maximisation algorithm; language translation; smoothing methods; conditional phrase translation probabilities; expectation-maximization; phrase-based statistical machine translation; smoothing methods; statistical estimation; word-aligned parallel corpus; Concurrent computing; Containers; Data mining; Frequency estimation; Logic; Parameter estimation; Probability; Smoothing methods; State estimation; Training data; Parameter Estimation; Smoothing Methods; Transduction;
fLanguage
English
Publisher
IEEE
Conference_Titel
Spoken Language Technology Workshop, 2008. SLT 2008. IEEE
Conference_Location
Goa
Print_ISBN
978-1-4244-3471-8
Electronic_ISBN
978-1-4244-3472-5
Type
conf
DOI
10.1109/SLT.2008.4777884
Filename
4777884