DocumentCode
610359
Title
A unified model for stable and temporal topic detection from social media data
Author
Hongzhi Yin ; Bin Cui ; Hua Lu ; Yuxin Huang ; Junjie Yao
Author_Institution
Dept. of Comput. Sci. & Technol., Peking Univ., Beijing, China
fYear
2013
fDate
8-12 April 2013
Firstpage
661
Lastpage
672
Abstract
Web 2.0 users generate and spread huge amounts of messages in online social media. Such user-generated content is a mixture of temporal topics (e.g., breaking events) and stable topics (e.g., user interests). Because the two kinds of topics differ in nature, it is important and useful to distinguish temporal topics from stable topics in social media. However, making this distinction is very challenging because user-generated texts in social media are very short and thus lack useful linguistic features for precise analysis using traditional approaches. In this paper, we propose a novel solution to detect both stable and temporal topics simultaneously from social media data. Specifically, a unified user-temporal mixture model is proposed to distinguish temporal topics from stable topics. To improve this model's performance, we design a regularization framework that exploits prior spatial information in a social network, as well as a burst-weighted smoothing scheme that exploits temporal prior information in the time dimension. We conduct extensive experiments to evaluate our proposal on two real data sets obtained from Del.icio.us and Twitter. The experimental results verify that our mixture model is able to distinguish temporal topics from stable topics in a single detection process. Our mixture model, enhanced with the spatial regularization and the burst-weighted smoothing scheme, significantly outperforms competitor approaches in terms of topic detection accuracy and discrimination between stable and temporal topics.
Keywords
Internet; information retrieval; linguistics; social networking (online); text analysis; Del.icio.us; Twitter; UGC; Web 2.0; burst-weighted smoothing scheme; linguistic features; online social media; spatial regularization; stable topic detection; temporal topic detection; user-generated contents; user-temporal mixture model; Equations; Feature extraction; Hidden Markov models; Mathematical model; Media
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Engineering (ICDE), 2013 IEEE 29th International Conference on
Conference_Location
Brisbane, QLD
ISSN
1063-6382
Print_ISBN
978-1-4673-4909-3
Electronic_ISBN
1063-6382
Type
conf
DOI
10.1109/ICDE.2013.6544864
Filename
6544864