Title :
An Improved Latent Dirichlet Allocation Model for Hot Topic Extraction
Author :
Guolong Liu ; Xiaofei Xu ; Ying Zhu ; Li Li
Author_Institution :
Dept. of Comput. Sci., Southwest Univ., Chongqing, China
Abstract :
Microblogging is fast becoming a dominant form of social media, and its impact is evident in our daily lives. A massive amount of information is produced every day, so detecting hot topics can help people get essential information quickly. However, because microblog posts are short and sparse and are flooded with meaningless tweets, among other characteristics, traditional topic detection methods cannot achieve a desirable level of performance. In this paper, we propose a multi-attribute Latent Dirichlet Allocation (MA-LDA) model, a topic analysis model in which the time and tag attributes of microblogs are incorporated into the LDA model. By introducing a time variable for the time attribute, the MA-LDA model can decide whether a word should appear in hot topics. Applying the tag attribute allows the MA-LDA model to rank core words high in the results, so that the expressiveness of the outcomes is improved over the traditional LDA model. Empirical evaluation on real data sets demonstrates that our method is able to detect hot topics accurately and efficiently, with more terms found for each hot topic. Our study provides strong evidence of the importance of the temporal factor in hot topic extraction.
Keywords :
natural language processing; social networking (online); MA-LDA; hot topic detection; hot topic extraction; microblogging; multiattribute latent Dirichlet allocation model; social media; tag attributes; temporal factor; time attributes; time variable; topic analysis model; Analytical models; Computational modeling; Feature extraction; Mathematical model; Resource management; Silicon; Standards; Latent Dirichlet Allocation; microblogs
Conference_Title :
2014 IEEE Fourth International Conference on Big Data and Cloud Computing (BdCloud)
Conference_Location :
Sydney, NSW
DOI :
10.1109/BDCloud.2014.55