Title :
WESPACT: Detection of web spamdexing with decision trees in GA perspective
Author :
Jayanthi, S.K. ; Sasikala, S.
Author_Institution :
Comput. Sci. Dept., Vellalar Coll. for Women, Erode, India
Abstract :
The Internet today is huge, dynamic, self-organized, and strongly interlinked. Web spam can significantly worsen the quality of search engine results. The paper is motivated by viewing the web spam problem as a cancer afflicting the Internet, for which a solution can be derived by formulating genetic algorithm (GA) based methods over content and link attributes. The web mining tools GATree [15] and PermutMatrix [14] have been used to simulate the experiments. Java is used to develop a program that analyzes and reports spamdexing instances. This paper proposes an algorithm, WESPACT, to detect web spam. Experiments show that the algorithm performs well.
Keywords :
Internet; Java; data mining; decision trees; genetic algorithms; search engines; security of data; unsolicited e-mail; GA; GA perspective; PermutMatrix; WESPACT; Web mining tools GATree; Web spamdexing detection; content attributes; genetic algorithm; link attributes; search engine result quality; Cancer; Classification algorithms; Computer aided software engineering; Informatics; Pattern recognition; Content spam; Link spam; Search engine
Conference_Title :
Pattern Recognition, Informatics and Medical Engineering (PRIME), 2012 International Conference on
Conference_Location :
Salem, Tamilnadu
Print_ISBN :
978-1-4673-1037-6
DOI :
10.1109/ICPRIME.2012.6208376