  • DocumentCode
    2492109
  • Title
    Detect Events on Noisy Textual Datasets
  • Author
    Yang, Sen ; Cheng, Xueqi ; Chen, You ; Zhang, Jin ; Xu, Hongbo ; Fang, Gaolin
  • Author_Institution
    Inst. of Comput. Technol., CAS, Beijing, China
  • fYear
    2010
  • fDate
    6-8 April 2010
  • Firstpage
    372
  • Lastpage
    374
  • Abstract
    Social media, e.g. weblogs and Internet forums, generate rich historical textual datasets that record many valuable events. Automatic event detection tries to discover important and interesting events and their related documents. Existing solutions to event detection, however, are mostly designed for high-quality news stories and may not work well when applied to noisy social media datasets, where content quality varies drastically from informative to trivial or even spam. In this paper, an event detection framework that directly uses the burst property of events to filter out noise is proposed. Experimental results on a real dataset from the Tencent Internet forum, a popular forum in China, demonstrate the effectiveness of the proposed framework.
  • Keywords
    Web sites; information filtering; text analysis; Tencent Internet forum; Weblog; automatic event detection; burst property; historical textual datasets; noise filtering; noisy textual datasets; social media; Blogs; Computers; Content addressable storage; Discussion forums; Event detection; Information filters; Noise generators; User-generated content; Videos; noisy textual dataset
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Conference (APWEB), 2010 12th International Asia-Pacific
  • Conference_Location
    Busan
  • Print_ISBN
    978-0-7695-4012-2
  • Electronic_ISBN
    978-1-4244-6600-9
  • Type
    conf
  • DOI
    10.1109/APWeb.2010.22
  • Filename
    5474108