• DocumentCode
    1757845
  • Title
    Discovering Emerging Topics in Social Streams via Link-Anomaly Detection
  • Author
    Takahashi, Tatsuro ; Tomioka, Ryota ; Yamanishi, Kenji
  • Author_Institution
    Inst. of Ind. Sci., Univ. of Tokyo, Tokyo, Japan
  • Volume
    26
  • Issue
    1
  • fYear
    2014
  • fDate
    Jan. 2014
  • Firstpage
    120
  • Lastpage
    130
  • Abstract
    Detection of emerging topics is now receiving renewed interest motivated by the rapid growth of social networks. Conventional term-frequency-based approaches may not be appropriate in this context, because the information exchanged in social-network posts includes not only text but also images, URLs, and videos. We focus on the emergence of topics signaled by social aspects of these networks. Specifically, we focus on mentions of users, that is, links between users that are generated dynamically (intentionally or unintentionally) through replies, mentions, and retweets. We propose a probability model of the mentioning behavior of a social network user, and propose to detect the emergence of a new topic from the anomalies measured through the model. Aggregating anomaly scores from hundreds of users, we show that we can detect emerging topics based only on the reply/mention relationships in social-network posts. We demonstrate our technique on several real data sets we gathered from Twitter. The experiments show that the proposed mention-anomaly-based approaches can detect new topics at least as early as text-anomaly-based approaches, and in some cases much earlier when the topic is poorly identified by the textual contents of posts.
  • Keywords
    behavioural sciences; probability; social networking (online); Twitter; anomaly score aggregation; conventional-term-frequency-based approach; emerging topic detection; emerging topic discovery; link-anomaly detection; mention-anomaly-based approach; probability model; social streams; social-network posts; text-anomaly-based approach; Density functional theory; Encoding; Hidden Markov models; Maximum likelihood estimation; Social network services; Training; Topic detection; anomaly detection; burst detection; sequentially discounted normalized maximum-likelihood coding; social networks;
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type
    jour
  • DOI
    10.1109/TKDE.2012.239
  • Filename
    6381411