Title :
Identifying spam tags by mining tag semantics
Author :
Yang, Hsin-Chang ; Lee, Chung-Hong
Author_Institution :
National University of Kaohsiung, Kaohsiung, Taiwan
Abstract :
Social bookmarking or tagging Web sites, also known as folksonomies, have emerged recently because user-annotated tags can improve the results of traditional context-driven Web page search. People can retrieve, organize, and comprehend Web pages through such tags. However, spam tags have increased significantly and have degraded the effectiveness of social tagging. Tags unrelated to the content of the annotated Web pages are added intentionally to manipulate ranking or retrieval results for malicious purposes. Approaches for filtering and identifying spam at various granularities have therefore been proposed to overcome this deficiency. Traditional approaches mostly focus on user-level detection, which tries to differentiate spammers from normal annotators. Such approaches often misclassify spammers as well as spam tags, because the intentions and behavior of spammers change dynamically. It would be better to detect spam at a finer granularity, such as individual posts or tags. In this work we propose a scheme to identify individual spam tags according to the semantic relatedness between a tag and its annotated page. We first cluster Web pages and tags separately to reveal the relationships among pages and among tags, respectively. A relationship discovery process is then applied to find the relationships between a page cluster and a tag cluster. We also devise a measure of the semantic relatedness between an individual tag and a page. Individual spam tags are then identified according to this relatedness. We conducted experiments on the ECML/PKDD RSDC 2008 dataset and obtained promising results.
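The idea of flagging a tag by its relatedness to the annotated page can be illustrated with a minimal sketch. This is not the method of the paper: it skips the clustering and relationship discovery stages entirely and substitutes a plain cosine similarity between a page's bag-of-words vector and the combined vector of the other pages carrying the same tag. All function names, the toy data, and the threshold of 0.1 are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def term_vector(text):
    """Bag-of-words term frequencies for a page's text."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def relatedness(tag, page_id, pages, posts):
    """Stand-in relatedness score (an assumption, not the paper's measure):
    similarity between the page and the other pages that carry the tag."""
    profile = Counter()
    for pid, tags in posts.items():
        if pid != page_id and tag in tags:
            profile.update(term_vector(pages[pid]))
    return cosine(term_vector(pages[page_id]), profile)

def spam_tags(pages, posts, threshold=0.1):
    """Return (page_id, tag) pairs whose relatedness falls below the threshold."""
    return sorted(
        (pid, tag)
        for pid, tags in posts.items()
        for tag in tags
        if relatedness(tag, pid, pages, posts) < threshold
    )

# Toy data (hypothetical): page text and the tags attached to each page.
pages = {
    "p1": "python tutorial for data mining",
    "p2": "data mining with python libraries",
    "p3": "cheap watches online store",
    "p4": "luxury watches store discount",
}
posts = {
    "p1": {"python"},
    "p2": {"python"},
    "p3": {"python", "watches"},  # "python" is unrelated to this page
    "p4": {"watches"},
}

print(spam_tags(pages, posts))
```

On this toy data only the "python" tag on page p3 is flagged, since p3 shares no terms with the other pages tagged "python". A tag that appears on a single page has no other pages to compare against and scores 0.0 in this sketch, so it would be flagged regardless of its content; a cluster-based relatedness such as the one described in the abstract is one way to avoid that limitation.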
Keywords :
Filtering; Tagging; Unsolicited electronic mail; Web pages;
Conference_Title :
Data Mining and Intelligent Information Technology Applications (ICMiA), 2011 3rd International Conference on
Conference_Location :
Macao
Print_ISBN :
978-1-4673-0231-9