DocumentCode
480673
Title
Topic Detection and Tracking for Threaded Discussion Communities
Author
Zhu, Mingliang ; Hu, Weiming ; Wu, Ou
Author_Institution
Nat. Lab. of Pattern Recognition, Chinese Acad. of Sci., Beijing
Volume
1
fYear
2008
fDate
9-12 Dec. 2008
Firstpage
77
Lastpage
83
Abstract
Threaded discussion communities are one of the most common forms of online community and are becoming increasingly popular among web users. Every day a huge number of new discussions are added to these communities, making them difficult to summarize and search. In this paper, we propose a topic detection and tracking (TDT) method for discussion threads. Most existing TDT methods deal with news stories, but the language used in discussion data is much more casual, oral, and informal than that of news data. To solve this problem, we design several extensions to the basic TDT framework that focus on the nature of discussion data, including a thread/post activity validation step, a term pos-weighting strategy, and a two-level decision framework that considers not only content similarity but also user activity information. Experimental results show that our proposed method greatly improves on current TDT methods in a real discussion community environment. With the help of TDT, discussion data can be better organized for searching and visualization.
Keywords
Internet; information analysis; Web users; content similarity; discussion data; online communities; threaded discussion communities; topic detection; tracking method; two-level decision framework; user activity information; Automation; Communities; Data visualization; Discussion forums; Intelligent agent; Internet; Laboratories; Pattern recognition; Search engines; Yarn; Authorship Analysis; Content Analysis; Online Discussion Community; Topic Detection and Tracking;
fLanguage
English
Publisher
IEEE
Conference_Titel
Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT '08. IEEE/WIC/ACM International Conference on
Conference_Location
Sydney, NSW
Print_ISBN
978-0-7695-3496-1
Type
conf
DOI
10.1109/WIIAT.2008.50
Filename
4740429