Regional Information Center for Science and Technology (RICeST) - An Analysis of Machine Learning Methods for Spam Host Detection

DocumentCode :

589315

Title :

An Analysis of Machine Learning Methods for Spam Host Detection

Author :

Silva, Ricardo M. ; Yamakami, Akebo ; Almeida, Tiago A.

Author_Institution :

Sch. of Electr. & Comput. Eng., Univ. of Campinas-UNICAMP, Campinas, Brazil

Volume :

fYear :

2012

fDate :

12-15 Dec. 2012

Firstpage :

227

Lastpage :

232

Abstract :

The web is becoming an increasingly important source of entertainment, communication, research, news and trade. As a result, web sites compete to attract the attention of users, and many of them achieve visibility through malicious strategies that try to circumvent the search engines. Such sites are known as web spam and they are generally responsible for personal injury and economic losses. Given this scenario, this paper presents a comprehensive performance evaluation of several established machine learning techniques used to automatically detect and filter hosts that disseminate web spam. Our experiments were diligently designed to ensure statistically sound results, and they indicate that bagging of decision trees, multilayer perceptron neural networks, random forest and adaptive boosting of decision trees are promising for the task of web spam classification and, hence, can be used as a good baseline for further comparison.

Keywords :

Web sites; decision trees; learning (artificial intelligence); multilayer perceptrons; pattern classification; search engines; security of data; unsolicited e-mail; Web spam classification; comprehensive performance evaluation; decision trees adaptive boosting; machine learning methods; machine learning techniques; malicious strategy; multilayer perceptron neural networks; random forest; spam host detection; Bagging; Boosting; Kernel; Neurons; Support vector machines; Unsolicited electronic mail; classification; spam host; spamdexing; web spam;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Machine Learning and Applications (ICMLA), 2012 11th International Conference on

Conference_Location :

Boca Raton, FL

Print_ISBN :

978-1-4673-4651-1

Type :

conf

DOI :

10.1109/ICMLA.2012.161

Filename :

6406755

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=589315