DocumentCode
2492109
Title
Detect Events on Noisy Textual Datasets
Author
Yang, Sen ; Cheng, Xueqi ; Chen, You ; Zhang, Jin ; Xu, Hongbo ; Fang, Gaolin
Author_Institution
Inst. of Comput. Technol., CAS, Beijing, China
fYear
2010
fDate
6-8 April 2010
Firstpage
372
Lastpage
374
Abstract
Social media, such as weblogs and Internet forums, generate rich historical textual datasets that record many valuable events. Automatic event detection tries to discover important and interesting events and their related documents. Existing solutions to event detection, however, were mostly proposed for high-quality news stories and may not work well on noisy social media datasets, where content quality varies drastically from informative to trivial or even spam. In this paper, an event detection framework is proposed that directly uses the burst property of events to filter out noise. Experimental results on a real dataset from the Tencent Internet forum, a popular forum in China, demonstrate the effectiveness of the proposed framework.
Keywords
Web sites; information filtering; text analysis; Tencent Internet forum; Weblog; automatic event detection; burst property; historical textual datasets; noise filtering; noisy textual datasets; social media; Blogs; Computers; Content addressable storage; Discussion forums; Event detection; Information filters; Noise generators; User-generated content; Videos
fLanguage
English
Publisher
IEEE
Conference_Titel
Web Conference (APWEB), 2010 12th International Asia-Pacific
Conference_Location
Busan
Print_ISBN
978-0-7695-4012-2
Electronic_ISBN
978-1-4244-6600-9
Type
conf
DOI
10.1109/APWeb.2010.22
Filename
5474108