• DocumentCode
    2140145
  • Title

    Bilingual seed lexicon adaptation for entity translation extraction

  • Author

    Wei Wang ; Tiejun Zhao ; Chunyue Zhang

  • Author_Institution
    Sch. of Comput. Sci. & Technol., Harbin Inst. of Technol., Harbin, China
  • fYear
    2013
  • fDate
    23-25 July 2013
  • Firstpage
    1309
  • Lastpage
    1313
  • Abstract
    A bilingual seed lexicon, regarded as a bridge between two languages, is one of the main resources used to extract entity translations from comparable corpora. However, little attention has been paid to this lexicon beyond its coverage. In fact, the quality of the seed lexicon is one of the key factors affecting the accuracy of entity translation extraction. In this paper, we propose a new self-adaptive model. We use a word segmentation technique to adapt segmented corpora and then propose two weight allocation strategies and a corresponding filter. Experiments demonstrate that our technique significantly outperforms the standard approach.
  • Keywords
    language translation; linguistics; natural language processing; text analysis; bilingual seed lexicon adaptation; entity translation extraction; segmented corpora; self-adaptive model; weight allocation; word segmentation technique; Context; Correlation; Noise; Radio spectrum management; Resource management; Standards; Vectors; adaptation; comparable corpora; seed lexicon
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Computation (ICNC), 2013 Ninth International Conference on
  • Conference_Location
    Shenyang
  • Type

    conf

  • DOI
    10.1109/ICNC.2013.6818181
  • Filename
    6818181