Title :
Topic Detection and Tracking for Threaded Discussion Communities
Author :
Zhu, Mingliang ; Hu, Weiming ; Wu, Ou
Author_Institution :
Nat. Lab. of Pattern Recognition, Chinese Acad. of Sci., Beijing
Abstract :
Threaded discussion communities are one of the most common forms of online community and are becoming increasingly popular among web users. Every day a huge number of new discussions are added to these communities, making them difficult to summarize and search. In this paper, we propose a topic detection and tracking (TDT) method for discussion threads. Most existing TDT methods deal with news stories, but the language used in discussion data is much more casual, colloquial, and informal than that of news data. To solve this problem, we design several extensions to the basic TDT framework that focus on the nature of discussion data: a thread/post activity validation step, a term pos-weighting strategy, and a two-level decision framework that considers not only content similarity but also user activity information. Experimental results show that our proposed method greatly improves on current TDT methods in a real discussion community environment. With the help of TDT, discussion data can be better organized for searching and visualization.
Keywords :
Internet; information analysis; Web users; content similarity; discussion data; online communities; threaded discussion communities; topic detection; tracking method; two-level decision framework; user activity information; Automation; Communities; Data visualization; Discussion forums; Intelligent agent; Internet; Laboratories; Pattern recognition; Search engines; Yarn; Authorship Analysis; Content Analysis; Online Discussion Community; Topic Detection and Tracking;
Conference_Titel :
Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT '08. IEEE/WIC/ACM International Conference on
Conference_Location :
Sydney, NSW
Print_ISBN :
978-0-7695-3496-1
DOI :
10.1109/WIIAT.2008.50