DocumentCode :
1837114
Title :
Web Spam Detection Using Link-Based Ant Colony Optimization
Author :
Taweesiriwate, Apichat ; Manaskasemsak, Bundit ; Rungsawang, Arnon
Author_Institution :
Fac. of Eng., Dept. of Comput. Eng., Kasetsart Univ., Bangkok, Thailand
fYear :
2012
fDate :
26-29 March 2012
Firstpage :
868
Lastpage :
873
Abstract :
Web spam is one of the most important problems degrading the quality and efficiency of web search engines. In this paper, we present a novel link-based ant colony optimization learning algorithm for spam host detection. The host graph is first constructed by aggregating the pages' hyperlink structure. Following the TrustRank assumption, ants start walking from a normal host and randomly follow host links according to a probability distribution. Classification rules are then generated from the common features of the normal hosts that the ants discover in sequence. In experiments with the WEBSPAM-UK2006 dataset, the proposed learning model achieves higher accuracy in classifying both normal and spam hosts than several baselines, including the state-of-the-art C4.5. Moreover, we also provide an analysis of parameter tuning for better results.
Keywords :
Internet; ant colony optimisation; learning (artificial intelligence); pattern classification; search engines; security of data; statistical distributions; trusted computing; unsolicited e-mail; WEBSPAM-UK2006 dataset; Web search engine; Web spam detection; classification rule; host graph; host link; hyperlink structure; learning model; link-based ant colony optimization learning algorithm; parameter tuning; probability distribution; spam host detection; trust rank assumption; Accuracy; Ant colony optimization; Classification algorithms; Feature extraction; Legged locomotion; Training; Web pages; ant colony optimization algorithm; content spam; link spam; spam detection; web link structure; web spam;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Advanced Information Networking and Applications (AINA), 2012 IEEE 26th International Conference on
Conference_Location :
Fukuoka
ISSN :
1550-445X
Print_ISBN :
978-1-4673-0714-7
Type :
conf
DOI :
10.1109/AINA.2012.118
Filename :
6184960