DocumentCode
531595
Title
A Framework for Co-classification of Articles and Users in Wikipedia
Author
Liu, Lei ; Tan, Pang-Ning
Author_Institution
Dept. of Comput. Sci. & Eng., Michigan State Univ., East Lansing, MI, USA
Volume
1
fYear
2010
fDate
Aug. 31 2010-Sept. 3 2010
Firstpage
212
Lastpage
215
Abstract
The massive size of Wikipedia and the ease with which its content can be created and edited have made Wikipedia an interesting domain for a variety of classification tasks, including topic detection, spam detection, and vandalism detection. These tasks are typically cast as a link-based classification problem, in which the class label of an article or a user is determined from its content-based and link-based features. Prior work has focused primarily on classifying either the editors or the articles, but not both. Yet there are many situations in which classification can be aided by knowing the class labels of the users and articles collectively (e.g., spammers are more likely to post spam content than non-spammers). This paper presents a novel framework to jointly classify Wikipedia articles and editors, assuming there are correspondences between their classes. Our experimental results demonstrate that the proposed co-classification algorithm outperforms classifiers that are trained independently to predict the class labels of articles and editors.
Keywords
pattern classification; search engines; Wikipedia; article classification; co-classification algorithm; content-based feature; link-based classification problem; link-based feature; spam detection; topic detection; user classification; vandalism detection; Link-based classification
fLanguage
English
Publisher
IEEE
Conference_Titel
Web Intelligence and Intelligent Agent Technology (WI-IAT), 2010 IEEE/WIC/ACM International Conference on
Conference_Location
Toronto, ON
Print_ISBN
978-1-4244-8482-9
Electronic_ISBN
978-0-7695-4191-4
Type
conf
DOI
10.1109/WI-IAT.2010.223
Filename
5616539