DocumentCode :
1386198
Title :
Practical Detection of Spammers and Content Promoters in Online Video Sharing Systems
Author :
Benevenuto, Fabrício ; Rodrigues, Tiago ; Veloso, Adriano ; Almeida, Jussara ; Gonçalves, Marcos ; Almeida, Virgílio
Author_Institution :
Comput. Sci. Dept., Fed. Univ. of Ouro Preto, Ouro Preto, Brazil
Volume :
42
Issue :
3
fYear :
2012
fDate :
6/1/2012 12:00:00 AM
Firstpage :
688
Lastpage :
701
Abstract :
A number of online video sharing systems, out of which YouTube is the most popular, provide features that allow users to post a video as a response to a discussion topic. These features open opportunities for users to introduce polluted content, or simply pollution, into the system. For instance, spammers may post an unrelated video as response to a popular one, aiming at increasing the likelihood of the response being viewed by a larger number of users. Moreover, content promoters may try to gain visibility to a specific video by posting a large number of (potentially unrelated) responses to boost the rank of the responded video, making it appear in the top lists maintained by the system. Content pollution may jeopardize the trust of users on the system, thus compromising its success in promoting social interactions. In spite of that, the available literature is very limited in providing a deep understanding of this problem. In this paper, we address the issue of detecting video spammers and promoters. Towards that end, we first manually build a test collection of real YouTube users, classifying them as spammers, promoters, and legitimate users. Using our test collection, we provide a characterization of content, individual, and social attributes that help distinguish each user class. We then investigate the feasibility of using supervised classification algorithms to automatically detect spammers and promoters, and assess their effectiveness in our test collection. While our classification approach succeeds at separating spammers and promoters from legitimate users, the high cost of manually labeling vast amounts of examples compromises its full potential in realistic scenarios. 
For this reason, we further propose an active learning approach that automatically chooses a set of examples to label, which is likely to provide the highest amount of information, drastically reducing the amount of required training data while maintaining comparable classification effectiveness.
Keywords :
learning (artificial intelligence); pattern classification; social networking (online); YouTube; active learning approach; content promoters; online video sharing systems; polluted content; practical detection; supervised classification algorithm; video spammers detection; Crawlers; Electronic mail; Labeling; Measurement; Pollution; Tin; YouTube; Promoter; social media; social networks; spammer; video promotion; video response; video spam; Algorithms; Artificial Intelligence; Computer Simulation; Decision Support Techniques; Information Storage and Retrieval; Internet; Models, Theoretical; Online Systems; Pattern Recognition, Automated; Video Recording;
fLanguage :
English
Journal_Title :
Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on
Publisher :
IEEE
ISSN :
1083-4419
Type :
jour
DOI :
10.1109/TSMCB.2011.2173799
Filename :
6093756