Title :
Feature expansion for Microblogging text based on Latent Dirichlet Allocation with User Feature
Author :
Wei Xia ; Yanxiang He ; Ye Tian ; Qiang Chen ; Lu Lin
Author_Institution :
Sch. of Comput., Wuhan Univ., Wuhan, China
Abstract :
Traditional Topic Detection and Tracking (TDT) is based on large-scale news streams. However, with the development of new technology, microblogging platforms have become a new generation of platforms for information distribution and communication. Because microblogging text has many features that differ substantially from those of ordinary news reports, existing TDT methods become ineffective. We present a new framework based on U-LDA (Latent Dirichlet Allocation with User Feature), which takes the user features of the microblogging platform into account. We use the U-LDA model to expand the features of short microblogging texts, which improves the precision of TDT tasks. In this paper, we discuss and summarize the particular features of microblogging text, present a method that incorporates user features into the LDA model, and propose a general TDT framework based on the U-LDA model. Applying the new model to a microblogging corpus, we conclude that U-LDA is more effective than LDA.
Keywords :
social networking (online); text analysis; TDT tasks; U-LDA; feature expansion; information distribution and communication; latent Dirichlet allocation with user feature; microblogging corpus; microblogging platform; microblogging text; topic detection and tracking; LDA model; TDT; short text; user features
Conference_Title :
Information Technology and Artificial Intelligence Conference (ITAIC), 2011 6th IEEE Joint International
Conference_Location :
Chongqing
Print_ISBN :
978-1-4244-8622-9
DOI :
10.1109/ITAIC.2011.6030192