DocumentCode
1757845
Title
Discovering Emerging Topics in Social Streams via Link-Anomaly Detection
Author
Takahashi, Tatsuro ; Tomioka, Ryota ; Yamanishi, Kenji
Author_Institution
Inst. of Ind. Sci., Univ. of Tokyo, Tokyo, Japan
Volume
26
Issue
1
fYear
2014
fDate
Jan. 2014
Firstpage
120
Lastpage
130
Abstract
Detection of emerging topics is receiving renewed interest, motivated by the rapid growth of social networks. Conventional term-frequency-based approaches may not be appropriate in this context, because the information exchanged in social-network posts includes not only text but also images, URLs, and videos. We focus on the emergence of topics signaled by social aspects of these networks. Specifically, we focus on mentions of users, that is, links between users that are generated dynamically (intentionally or unintentionally) through replies, mentions, and retweets. We propose a probability model of the mentioning behavior of a social network user, and propose to detect the emergence of a new topic from the anomalies measured through the model. By aggregating anomaly scores from hundreds of users, we show that we can detect emerging topics based only on the reply/mention relationships in social-network posts. We demonstrate our technique on several real data sets gathered from Twitter. The experiments show that the proposed mention-anomaly-based approaches can detect new topics at least as early as text-anomaly-based approaches, and in some cases much earlier when the topic is poorly identified by the textual contents of posts.
Keywords
behavioural sciences; probability; social networking (online); Twitter; anomaly score aggregation; conventional-term-frequency-based approach; emerging topic detection; emerging topic discovery; link-anomaly detection; mention-anomaly-based approach; probability model; social streams; social-network posts; text-anomaly-based approach; Density functional theory; Encoding; Hidden Markov models; Maximum likelihood estimation; Social network services; Training; Topic detection; anomaly detection; burst detection; sequentially discounted normalized maximum-likelihood coding; social networks;
fLanguage
English
Journal_Title
Knowledge and Data Engineering, IEEE Transactions on
Publisher
IEEE
ISSN
1041-4347
Type
jour
DOI
10.1109/TKDE.2012.239
Filename
6381411