• DocumentCode
    556733
  • Title
    Parallel treebank from word-aligned bilingual corpus. Language engineering for phrasal alignments
  • Author
    Colhon, Mihaela
  • Author_Institution
    Dept. of Comput. Sci., Univ. of Craiova, Craiova, Romania
  • fYear
    2011
  • fDate
    14-16 Oct. 2011
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    In this paper we describe a mechanism for generating a parallel treebank between an intensively studied language (English) and a less-studied language such as Romanian. The Romanian constituents of the treebank are induced from the corresponding constituents of the English part, taking into account the word alignments of the corpus. The proposed mechanism reuses and adjusts existing tools and algorithms for automatic part-of-speech annotation and syntactic tree alignment.
  • Keywords
    natural language processing; Romanian; language engineering; parallel treebank; part-of-speech annotation; phrasal alignments; syntactic trees alignment; word-aligned bilingual corpus; Europe; Natural language processing; Pragmatics; Proposals; Syntactics; Tagging; Training
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    System Theory, Control, and Computing (ICSTCC), 2011 15th International Conference on
  • Conference_Location
    Sinaia
  • Print_ISBN
    978-1-4577-1173-2
  • Type
    conf
  • Filename
    6085680