DocumentCode :
2717730
Title :
An Efficient Clustering Algorithm for Microblogging Hot Topic Detection
Author :
Tu, Hao ; Ding, Jin
Author_Institution :
Network & Comput. Center, Huazhong Univ. of Sci. & Tech., Wuhan, China
fYear :
2012
fDate :
11-13 Aug. 2012
Firstpage :
738
Lastpage :
741
Abstract :
Microblogging has become exceedingly popular, with hundreds of millions of tweets posted every minute on a variety of topics. Most hot events are retweeted thousands of times within a short time, which helps us trace them. This paper focuses on tracing those events by mining the microblog text stream. Although event detection has long been a research topic, the characteristics of microblogs bring new challenges. Tweets reporting such events are usually overwhelmed by a flood of meaningless tweets, and the algorithm needs to be scalable given the sheer volume of tweets. We first use Bayes classification to filter out the meaningless tweets, then detect hot events among the remaining tweets with an incomplete clustering based on mean calculation. The experiments show that the algorithm can detect hot events in real time from a large volume of tweets while maintaining good accuracy.
Keywords :
Bayes methods; data mining; pattern clustering; social networking (online); text analysis; Bayes classification; Tweets; efficient clustering algorithm; event detection; mean calculation based incomplete clustering; microblogging hot topic detection; text stream mining; Accuracy; Algorithm design and analysis; Classification algorithms; Clustering algorithms; Event detection; Filtering algorithms; Twitter; clustering algorithm; microblog; topic detection;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer Science & Service System (CSSS), 2012 International Conference on
Conference_Location :
Nanjing
Print_ISBN :
978-1-4673-0721-5
Type :
conf
DOI :
10.1109/CSSS.2012.189
Filename :
6394427