DocumentCode
2351704
Title
Describing Web Topics Meticulously through Word Graph Analysis
Author
Sun, Bai ; Shi, Lei ; Kong, Liang ; Zhang, Yan
Author_Institution
Dept. of Machine Intell., Peking Univ., Beijing, China
Volume
2
fYear
2009
fDate
11-14 Oct. 2009
Firstpage
142
Lastpage
147
Abstract
Topic description is as important as topic detection. In this paper, we propose a novel method to describe Web topics with topic words. Under the assumption that representative words occur in important sentences and have a high probability of co-occurring with other representative words, two graphs are built: one represents the relationships among sentences, the other the relationships among words. Because a topic cluster contains a set of different Web pages, sentence clusters are also introduced. Experimental results on a real data set show that our method achieves both high precision and high efficiency, especially when real Web data contain a large amount of noise.
Keywords
Internet; content management; data mining; graph theory; information retrieval; Web pages; Web topics; noise; sentence clusters; topic cluster; topic description; topic detection; topic words; word graph analysis; Broadcasting; Data mining; Frequency; Information analysis; Information retrieval; Information technology; Machine intelligence; Noise reduction; Sun; Web pages
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer and Information Technology, 2009. CIT '09. Ninth IEEE International Conference on
Conference_Location
Xiamen
Print_ISBN
978-0-7695-3836-5
Type
conf
DOI
10.1109/CIT.2009.55
Filename
5329146