• DocumentCode
    2352697
  • Title

    Topic Detection and Tracking for Chinese News Web Pages

  • Author

    Qiu, Jing ; Liao, Lejian ; Dong, Xiujie

  • Author_Institution
    Sch. of Comput. Sci., Beijing Inst. of Technol., Beijing
  • fYear
    2008
  • fDate
    23-25 July 2008
  • Firstpage
    114
  • Lastpage
    120
  • Abstract
    With the continuous growth in the number of available Web news sites and the diversity in how they present content, there is an increasing need to mine news correlations on the Web in order to keep track of the successive development of specific events. This paper presents a new approach to topic tracking for Chinese news Web pages. Temporal information extracted from news texts and "key Web contexts" extracted from HTML documents are used to improve the performance of the dependency structure language model (DSLM). Experimental results show the usefulness of our approach.
  • Keywords
    Web sites; information retrieval; Chinese news Web pages; HTML documents; Web news correlation mining; Web news sites; dependency structure language model; key Web contexts; news texts; temporal information extraction; topic detection; Computer science; Context modeling; Data mining; Electronic mail; Event detection; HTML; Information technology; Laboratories; Natural languages; Web pages; Topic tracking; content extraction
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advanced Language Processing and Web Information Technology, 2008. ALPIT '08. International Conference on
  • Conference_Location
    Dalian, Liaoning
  • Print_ISBN
    978-0-7695-3273-8
  • Type

    conf

  • DOI
    10.1109/ALPIT.2008.31
  • Filename
    4584352