• DocumentCode
    531595
  • Title

    A Framework for Co-classification of Articles and Users in Wikipedia

  • Author

    Liu, Lei ; Tan, Pang-Ning

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Michigan State Univ., East Lansing, MI, USA
  • Volume
    1
  • fYear
    2010
  • fDate
    Aug. 31 2010-Sept. 3 2010
  • Firstpage
    212
  • Lastpage
    215
  • Abstract
    The massive size of Wikipedia and the ease with which its content can be created and edited has made Wikipedia an interesting domain for a variety of classification tasks, including topic detection, spam detection, and vandalism detection. These tasks are typically cast into a link-based classification problem, in which the class label of an article or a user is determined from its content-based and link-based features. Prior works have focused primarily on classifying either the editors or the articles (but not both). Yet there are many situations in which the classification can be aided by knowing collectively the class labels of the users and articles (e.g., spammers are more likely to post spam content than non-spammers). This paper presents a novel framework to jointly classify the Wikipedia articles and editors, assuming there are correspondences between their classes. Our experimental results demonstrate that the proposed co-classification algorithm outperforms classifiers that are trained independently to predict the class labels of articles and editors.
  • Keywords
    pattern classification; search engines; Wikipedia; article classification; co-classification algorithm; content-based feature; link-based classification problem; link-based feature; spam detection; topic detection; user classification; vandalism detection; Link-based classification; Wikipedia;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Web Intelligence and Intelligent Agent Technology (WI-IAT), 2010 IEEE/WIC/ACM International Conference on
  • Conference_Location
    Toronto, ON
  • Print_ISBN
    978-1-4244-8482-9
  • Electronic_ISBN
    978-0-7695-4191-4
  • Type

    conf

  • DOI
    10.1109/WI-IAT.2010.223
  • Filename
    5616539