DocumentCode
3269903
Title
Multilayer Anchor Alignment in AC-E Parallel Corpora of Chinese Tea Classics
Author
Yi, Jiang ; Xin, Jiang ; Dapeng, Wang
Author_Institution
Sch. of Foreign Languages, Dalian Univ. of Technol., Dalian, China
Volume
2
fYear
2009
fDate
6-7 June 2009
Firstpage
498
Lastpage
501
Abstract
Chinese tea literature has not yet appeared in existing corpora. Bilingual corpora of ancient Chinese and English (AC-E) also await extension for purposes such as CAT and the design of educational software for Confucius Institutes worldwide. This paper works toward such a corpus by demonstrating how multilayer anchor points improve alignment accuracy in a bilingual parallel corpus of tea classics. An experiment is carried out with four layers of "anchor points". The first layer consists of technical terms, extracted with the Term List function of the WinAlign module in Trados. The second consists of register-specific words with a 1:1 co-occurrence frequency in the SL and TL. The third and fourth consist, respectively, of proper nouns and transliterated Chinese-unique words. Statistics show that alignment accuracy increases with the addition of each layer. Since such anchor points are typical of ancient Chinese classics, the method can be generalized to related fields.
Keywords
literature; natural language processing; word processing; AC-E parallel corpora; Chinese tea classics; Chinese tea literature; cooccurrence frequency; multilayer anchor alignment; register-specific words; Computational intelligence; Concurrent computing; Educational institutions; Frequency; Natural languages; Nonhomogeneous media; Software design; Spine; Statistics; Thesauri; anchor point; bilingual parallel corpus; machine translation; multi-layer; sentence alignment;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Intelligence and Natural Computing, 2009. CINC '09. International Conference on
Conference_Location
Wuhan
Print_ISBN
978-0-7695-3645-3
Type
conf
DOI
10.1109/CINC.2009.78
Filename
5231285
Link To Document