DocumentCode
3722216
Title
Real-Time Twitter Content Polluter Detection Based on Direct Features
Author
Weiling Chen;Chai Kiat Yeo;Chiew Tong Lau;Bu Sung Lee
Author_Institution
Sch. of Comput. Eng., Nanyang Technol. Univ., Singapore, Singapore
fYear
2015
Firstpage
1
Lastpage
4
Abstract
The large number of content polluters on social networks makes it difficult for users to find valuable content. Some research has addressed spam and phishing detection on social networks, but spam and phishing account for only a small part of all content polluters. What bothers users most is the large volume of repeated, low-quality advertisements. Hence it is necessary to filter out these content polluters to improve the user experience. Moreover, most phishing/spam detection work is done offline, and some of the features used take too long to extract, which makes real-time detection impossible. We perform a study on an extensive Twitter dataset and present a definition of content polluters. We further propose some novel features and, together with other features commonly used in phishing/spam detection, classify them into two categories: direct features and indirect features. A simple random forest classifier using only our proposed direct features is applied to real-time content polluter detection, and it achieves reasonably high accuracy with high F1 values.
Keywords
"Feature extraction","Twitter","Real-time systems","Labeling","Electronic mail","Training"
Publisher
ieee
Conference_Titel
Information Science and Security (ICISS), 2015 2nd International Conference on
Type
conf
DOI
10.1109/ICISSEC.2015.7371027
Filename
7371027
Link To Document