Title :
Dynamic Splog Filtering Algorithm Based on Combinational Features
Author :
Ren, Yong-gong ; Yin, Ming-fei ; Wang, Jian
Author_Institution :
Sch. of Comput. & Inf. Technol., Liaoning Normal Univ., Dalian, China
Abstract :
This paper focuses on spam blog (splog) detection. Blogs are a highly popular new-media mechanism for social communication. Splogs degrade blog search results and waste network resources, yet existing splog identification algorithms rely on lexical frequency features, which are highly redundant and poorly correlated. In our approach, we use a dynamic filtering algorithm based on the combinational features of splogs (CFDS) to detect splogs. The CFDS algorithm selects several efficient, novel features, such as self-similarity features and author attributes, to replace the larger set of redundant lexical frequency features. Moreover, we extract a content-based feature vector from different parts of the blog. The dimensionality of the feature vector is reduced using the ECE (Expected Cross Entropy) evaluation criterion. We tested an SVM-based splog detector using the combinational features on standard datasets, and it showed excellent filtering efficiency.
Keywords :
entropy; social networking (online); support vector machines; SVM; author attributes; combinational features; content based feature vector extraction; dynamic spam blog filtering algorithm; expected cross entropy evaluation criterion; media social communication mechanism; self-similarity features; spam blog detection; Information systems;
Conference_Title :
Web Information Systems and Applications Conference (WISA), 2011 Eighth
Conference_Location :
Chongqing
Print_ISBN :
978-1-4577-1812-0
DOI :
10.1109/WISA.2011.23
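Note :
The abstract states that the content-based feature vector is reduced with the ECE (Expected Cross Entropy) criterion but does not give the formula. A minimal sketch follows, assuming the definition commonly used in text categorization, ECE(t) = P(t) * sum over classes c of P(c|t) * log(P(c|t) / P(c)), estimated from document frequencies. The function names, the tokenized-document input, and the "splog"/"blog" labels are illustrative assumptions, not taken from the paper.

```python
import math
from collections import Counter


def expected_cross_entropy(docs, labels):
    """Return {term: ECE score} for tokenized docs with one class label each.

    Assumed definition (not confirmed by the paper):
        ECE(t) = P(t) * sum_c P(c|t) * log(P(c|t) / P(c))
    Probabilities are estimated from document frequencies.
    """
    n = len(docs)
    class_count = Counter(labels)      # documents per class
    term_count = Counter()             # documents containing each term
    term_class_count = Counter()       # documents of a class containing a term
    for tokens, label in zip(docs, labels):
        for term in set(tokens):       # count each term once per document
            term_count[term] += 1
            term_class_count[(term, label)] += 1

    scores = {}
    for term, df in term_count.items():
        p_t = df / n
        total = 0.0
        for c, n_c in class_count.items():
            p_c_given_t = term_class_count[(term, c)] / df
            if p_c_given_t > 0:        # treat 0 * log(0) as 0
                total += p_c_given_t * math.log(p_c_given_t / (n_c / n))
        scores[term] = p_t * total
    return scores


def top_k_terms(scores, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


if __name__ == "__main__":
    # Toy corpus, purely illustrative.
    docs = [
        ["the", "buy", "cheap", "pills"],
        ["buy", "cheap", "now"],
        ["the", "my", "trip", "today"],
        ["my", "cat", "today"],
    ]
    labels = ["splog", "splog", "blog", "blog"]
    scores = expected_cross_entropy(docs, labels)
    print(top_k_terms(scores, 4))
```

Terms that appear evenly across classes score near zero and are dropped first, so the retained vector keeps the class-discriminating terms. The reduced vector could then be passed to any SVM implementation; the abstract does not say which one the authors used.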