DocumentCode
556733
Title
Parallel treebank from word-aligned bilingual corpus. Language engineering for phrasal alignments
Author
Colhon, Mihaela
Author_Institution
Dept. of Comput. Sci., Univ. of Craiova, Craiova, Romania
fYear
2011
fDate
14-16 Oct. 2011
Firstpage
1
Lastpage
6
Abstract
In this paper we describe a mechanism for generating a parallel treebank between an intensively studied language (i.e., English) and a less studied language, such as Romanian. The Romanian constituents of the treebank are induced from the corresponding constituents of the English part, taking into account the word alignments of the corpus. The proposed mechanism reuses and adapts existing tools and algorithms for automatic part-of-speech annotation and syntactic tree alignment.
Keywords
natural language processing; Romanian; language engineering; parallel treebank; part-of-speech annotation; phrasal alignments; syntactic trees alignment; word-aligned bilingual corpus; Europe; Pragmatics; Proposals; Syntactics; Tagging; Training
fLanguage
English
Publisher
IEEE
Conference_Titel
System Theory, Control, and Computing (ICSTCC), 2011 15th International Conference on
Conference_Location
Sinaia
Print_ISBN
978-1-4577-1173-2
Type
conf
Filename
6085680