DocumentCode :
3309165
Title :
Sentence Splitting for Vietnamese-English Machine Translation
Author :
Hung, Bui Thanh ; Minh, Nguyen Le ; Shimazu, Akira
Author_Institution :
Grad. Sch. of Inf. Sci., Japan Adv. Inst. of Sci. & Technol., Ishikawa, Japan
fYear :
2012
fDate :
17-19 Aug. 2012
Firstpage :
156
Lastpage :
160
Abstract :
Translation quality is often disappointing when a phrase-based machine translation system deals with long sentences. Because of syntactic structure discrepancies between the two languages, the translation output will not preserve the same word order as the source. When a sentence is long, it should be partitioned into several clauses, and word reordering in the translation should be done within clauses, not between clauses. In this paper, a rule-based technique is proposed to split long Vietnamese sentences based on linguistic information. We use the splitting boundaries when translating sentences with two types of constraints: wall and zone. This method is useful for preserving word order and improving translation quality. We describe experiments on translation from Vietnamese to English, showing improvements in BLEU and NIST scores.
Keywords :
knowledge based systems; language translation; BLEU score improvement; NIST score improvement; Vietnamese-English machine translation; linguistic information; phrase based machine translation system; rule-based technique; sentence splitting; sentence translation; splitting boundaries; syntactic structure discrepancy; translation quality improvement; word order preservation; word reordering; Barium; Computational modeling; Context; Decoding; NIST; Pragmatics; Training; phrase-based machine translation; rule-based sentence splitting; wall and zone constraints;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Knowledge and Systems Engineering (KSE), 2012 Fourth International Conference on
Conference_Location :
Danang
Print_ISBN :
978-1-4673-2171-6
Type :
conf
DOI :
10.1109/KSE.2012.28
Filename :
6299413