DocumentCode
2352697
Title
Topic Detection and Tracking for Chinese News Web Pages
Author
Qiu, Jing ; Liao, Lejian ; Dong, Xiujie
Author_Institution
Sch. of Comput. Sci., Beijing Inst. of Technol., Beijing
fYear
2008
fDate
23-25 July 2008
Firstpage
114
Lastpage
120
Abstract
With the continuous growth in the number of available Web news sites and the diversity in how they present content, there is an increasing need to mine news correlations on the Web in order to track the successive developments of a specific event. This paper presents a new approach to topic tracking for Chinese news Web pages. Temporal information extracted from news texts and "key Web contexts" extracted from HTML documents are used to improve the performance of the dependency structure language model (DSLM). Experimental results show the usefulness of the approach.
Keywords
Web sites; information retrieval; Chinese news Web pages; HTML documents; Web news correlation mining; Web news sites; dependency structure language model; key Web contexts; news texts; temporal information extraction; topic detection; Computer science; Context modeling; Data mining; Electronic mail; Event detection; HTML; Information technology; Laboratories; Natural languages; Web pages; Topic tracking; content extraction
fLanguage
English
Publisher
IEEE
Conference_Titel
Advanced Language Processing and Web Information Technology, 2008. ALPIT '08. International Conference on
Conference_Location
Dalian, Liaoning
Print_ISBN
978-0-7695-3273-8
Type
conf
DOI
10.1109/ALPIT.2008.31
Filename
4584352