• DocumentCode
    3727511
  • Title
    Entity linking and name disambiguation using SVM in Chinese micro-blogs
  • Author
    Jinlan Fu; Jie Qiu; Yunlong Guo; Li Li
  • Author_Institution
    Institute of Logic and Intelligence, Dept. of Computer Science, Southwest University, Chongqing 400715, China
  • fYear
    2015
  • Firstpage
    468
  • Lastpage
    472
  • Abstract
    Social media use has grown sharply with the development of Web 2.0, and entity disambiguation has attracted considerable attention recently. Understanding Chinese micro-blogs is challenging because of the inherent features of the Chinese language, its informal usage, and the wide variety of content the posts cover. The basis for disambiguation is obtained from the info and label items of Baidu Encyclopedia webpages. Baidu Encyclopedia and Wikipedia are used to build the mapping table. In this paper, entity disambiguation comprises two tasks: (1) linking entities, which is based mainly on the created mapping table; and (2) removing ambiguities of entities from micro-blogs, which is crucial in entity reorganization. An improved label disambiguation algorithm is proposed, with binary classification based on Chinese family names introduced as an improvement. An SVM model is applied to classify the entity under test. We evaluate our method on the open data sets provided by NLP&CC 2014 and achieve an accuracy of 84.02%, well above the 70.58% average accuracy of all participating teams. This shows that the proposed method is promising. Our work can provide valuable insights into entity disambiguation in Chinese micro-blogs.
  • Keywords
    "Encyclopedias","Joining processes","Knowledge based systems","Support vector machines","Internet","Electronic publishing"
  • Publisher
    ieee
  • Conference_Titel
    Natural Computation (ICNC), 2015 11th International Conference on
  • Electronic_ISBN
    2157-9563
  • Type
    conf
  • DOI
    10.1109/ICNC.2015.7378034
  • Filename
    7378034