• DocumentCode
    124245
  • Title

    Word Sense Induction with Multilingual Features Representation

  • Author

    Albano, Luca ; Beneventano, Domenico ; Bergamaschi, Sonia

  • Author_Institution
    DIEF, Univ. of Modena & Reggio Emilia, Modena, Italy
  • Volume
    2
  • fYear
    2014
  • fDate
    11-14 Aug. 2014
  • Firstpage
    343
  • Lastpage
    349
  • Abstract
    The use of word senses in place of surface word forms has been shown to improve performance on many computational tasks, including intelligent web search. In this paper we propose a novel approach to the automatic discovery of word senses from raw text, a task referred to as Word Sense Induction (WSI). Almost all the WSI approaches proposed in the literature deal with monolingual data, and only very few incorporate bilingual data. The WSI method we propose is innovative in that it uses multilingual data to perform WSI of words in a given language. The experiments show a clear overall improvement in performance: the single-language setting is outperformed by the multi-language settings on almost all the considered target words. The performance gain, in terms of F-Measure, has an average value of 5% and in some cases reaches 40%.
  • Keywords
    natural language processing; pattern clustering; text analysis; word processing; F-measure; WSI; context clustering; multilingual data; multilingual feature representation; word sense induction; Clustering algorithms; Context; Noise; Performance gain; Testing; Training; Vectors; Clustering; Multilingual; Web Search; Word Sense Disambiguation; Word Sense Induction;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence (WI) and Intelligent Agent Technologies (IAT), 2014 IEEE/WIC/ACM International Joint Conferences on
  • Conference_Location
    Warsaw
  • Type

    conf

  • DOI
    10.1109/WI-IAT.2014.117
  • Filename
    6927644