DocumentCode
2533712
Title
Dynamic Splog Filtering Algorithm Based on Combinational Features
Author
Ren, Yong-gong ; Yin, Ming-fei ; Wang, Jian
Author_Institution
Sch. of Comput. & Inf. Technol., Liaoning Normal Univ., Dalian, China
fYear
2011
fDate
21-23 Oct. 2011
Firstpage
82
Lastpage
85
Abstract
This paper focuses on spam blog (splog) detection. Blogs are a highly popular new-media mechanism for social communication. Existing algorithms identify splogs using lexical frequency features that are highly redundant and lack correlation, which degrades blog search results and wastes network resources. In our approach, we use a dynamic filtering algorithm based on the combinational features of splogs (CFDS) to detect splogs. The CFDS algorithm selects several efficient novel features, such as self-similarity features and author attributes, to replace the larger, redundant set of lexical frequency features. Moreover, we extract a content-based feature vector from different parts of the blog. The dimensionality of the feature vector is reduced using the ECE (Expected Cross Entropy) evaluation criterion. We tested an SVM-based splog detector using combinational features on the standard datasets, and it achieved excellent filtering efficiency.
Keywords
entropy; social networking (online); support vector machines; SVM; author attributes; combinational features; content based feature vector extraction; dynamic spam blog filtering algorithm; expected cross entropy evaluation criterion; media social communication mechanism; self-similarity features; spam blog detection; Information systems
fLanguage
English
Publisher
IEEE
Conference_Titel
Web Information Systems and Applications Conference (WISA), 2011 Eighth
Conference_Location
Chongqing
Print_ISBN
978-1-4577-1812-0
Type
conf
DOI
10.1109/WISA.2011.23
Filename
6093608