• DocumentCode
    559690
  • Title

    Identifying spam tags by mining tag semantics

  • Author

    Yang, Hsin-Chang ; Lee, Chung-Hong

  • Author_Institution
    National University of Kaohsiung, Kaohsiung, Taiwan
  • fYear
    2011
  • fDate
    24-26 Oct. 2011
  • Firstpage
    263
  • Lastpage
    268
  • Abstract
    Social bookmarking or tagging Web sites, also known as folksonomies, have emerged recently because user-annotated tags can improve the results of traditional context-driven search of Web pages. People can retrieve, organize, and comprehend Web pages through such tags. However, the number of spam tags has increased significantly, degrading the effectiveness of social tagging. Tags unrelated to the content of the annotated Web pages are added intentionally to manipulate ranking or retrieval results for malicious purposes. Approaches for filtering and identifying spam at various granularities have therefore been proposed to address this deficiency. Traditional approaches mostly focus on user-level detection, which tries to differentiate spammers from normal annotators. Such approaches often misclassify spammers as well as spam tags, since the intentions and behavior of spammers change dynamically. It would be better to detect spam at a finer granularity, such as individual posts or tags. In this work we propose a scheme to identify individual spam tags according to the semantic relatedness between a tag and the page it annotates. We first cluster Web pages and tags separately to reveal the relationships among pages and among tags, respectively. A relationship discovery process is then applied to find the relationships between page clusters and tag clusters. We also devise a measure of the semantic relatedness between an individual tag and a page. Individual spam tags are then identified according to this relatedness. We conducted experiments on the ECML/PKDD RSDC 2008 dataset and obtained promising results.
  • Keywords
    Filtering; Tagging; Unsolicited electronic mail; Web pages;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining and Intelligent Information Technology Applications (ICMiA), 2011 3rd International Conference on
  • Conference_Location
    Macao
  • Print_ISBN
    978-1-4673-0231-9
  • Type

    conf

  • Filename
    6108441
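
The tag-level idea in the abstract (score each tag against the page it annotates and flag weakly related tags) can be illustrated with a minimal sketch. It is not the authors' method: the record gives no details of their clustering, relationship discovery, or relatedness measure, so this sketch skips clustering entirely and substitutes a plain bag-of-words cosine similarity. The function names, the toy data, and the 0.6 threshold are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def term_vector(text):
    """Bag-of-words term-frequency vector (assumed page representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def tag_profiles(pages, posts):
    """Describe each tag by the summed term vectors of the pages it annotates."""
    profiles = defaultdict(Counter)
    for page_id, tag in posts:
        profiles[tag].update(term_vector(pages[page_id]))
    return profiles


def flag_spam(pages, posts, threshold=0.6):
    """Return (page_id, tag, score) for posts whose relatedness is below threshold.

    The threshold is an arbitrary illustrative value, not one from the paper.
    """
    profiles = tag_profiles(pages, posts)
    flagged = []
    for page_id, tag in posts:
        score = cosine(profiles[tag], term_vector(pages[page_id]))
        if score < threshold:
            flagged.append((page_id, tag, score))
    return flagged


# Toy data (hypothetical): the tag "python" is also attached to an unrelated page.
pages = {
    "p1": "python code tutorial",
    "p2": "python code examples",
    "p3": "cheap casino bonus",
}
posts = [("p1", "python"), ("p2", "python"), ("p3", "python")]
flagged = flag_spam(pages, posts)
```

A tag used on only one page always scores 1.0 against that page under this stand-in measure, which is one reason a cluster-based notion of relatedness, as the abstract describes, may be preferable in practice.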