DocumentCode
3107001
Title
Detecting Link Spam Using Temporal Information
Author
Shen, Guoyang ; Gao, Bin ; Liu, Tie-Yan ; Feng, Guang ; Song, Shiji ; Li, Hang
Author_Institution
Microsoft Res. Asia 4F, Beijing
fYear
2006
fDate
18-22 Dec. 2006
Firstpage
1049
Lastpage
1053
Abstract
How to effectively protect against spam on search ranking results is an important issue for contemporary web search engines. This paper addresses the problem of combating one major type of web spam: "link spam." Most of the previous work on anti link spam managed to make use of one snapshot of web data to detect spam, and thus it did not take advantage of the fact that link spam tends to result in drastic changes of links in a short time period. To overcome the shortcoming, this paper proposes using temporal information on links in detection of link spam, as well as other information. Specifically, it defines temporal features such as in-link growth rate (IGR) and in-link death rate (IDR) in a spam classification model (i.e., SVM). Experimental results on web domain graph data show that link spam can be successfully detected with the proposed method.
Keywords
search engines; unsolicited e-mail; IDR; IGR; Web spam; contemporary Web search engines; in-link death rate; in-link growth rate; link spam detection; temporal information; Asia; Data engineering; Data mining; Robustness; Search engines; Support vector machine classification; Support vector machines; Unsolicited electronic mail; Web pages; Web search
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining, 2006. ICDM '06. Sixth International Conference on
Conference_Location
Hong Kong
ISSN
1550-4786
Print_ISBN
0-7695-2701-7
Type
conf
DOI
10.1109/ICDM.2006.51
Filename
4053151