• DocumentCode
    3006521
  • Title
    Duplicate Detection for Identifying Social Spam in Microblogs
  • Author
    Qunyan Zhang ; Haixin Ma ; Weining Qian ; Aoying Zhou
  • Author_Institution
    Center for Cloud Computing & Big Data, East China Normal University, Shanghai, China
  • fYear
    2013
  • fDate
    June 27, 2013 to July 2, 2013
  • Firstpage
    141
  • Lastpage
    148
  • Abstract
    As an important kind of social media, microblogs have become a major source for opinion mining and the study of collective behavior. However, social spam can greatly distort analytical results. This paper focuses on the problem of identifying potential social spammers who copy pieces of information from others. An improved method based on locality-sensitive hashing is used to detect duplicated tweets. An intensive empirical study is conducted over a real-life microblog dataset crawled from Sina Weibo, one of the most popular microblogging services. The characteristics of potential spammers and their behaviors are analyzed.
  • Keywords
    cryptography; data mining; social networking (online); unsolicited e-mail; Sina Weibo; collective behavior study; locality-sensitive hashing based method; microblogging services; opinion mining; real-life microblog dataset; social media; social spam identification; tweet duplicate detection; Crawlers; Data handling; Information management; Media; Robots; Twitter; Unsolicited electronic mail; MapReduce; duplicate detection; locality-sensitive hash; microblog; social spam
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    2013 IEEE International Congress on Big Data (BigData Congress)
  • Conference_Location
    Santa Clara, CA
  • Print_ISBN
    978-0-7695-5006-0
  • Type
    conf
  • DOI
    10.1109/BigData.Congress.2013.27
  • Filename
    6597130