DocumentCode :
1679733
Title :
Research and Implementation of Web Structure-Based News Gathering System
Author :
Chen, Jianguo ; Lu, Minrong ; Ke, XiaoYu
Author_Institution :
Software Sch., Hunan Univ., Changsha, China
fYear :
2011
Firstpage :
502
Lastpage :
505
Abstract :
Based on an in-depth study of web information gathering technology, a web structure-based news gathering model is proposed. First, the model loads the gathering entry address and finds the news list page using the Information Gathering and Filter Algorithm. It then automatically identifies and refines the link addresses of news content pages according to the configured acquisition rules, combined with regular expressions. Next, it loads the target page (the news content page) and automatically gathers the news information with the algorithm. At the same time, it can filter out any specified information in the page, such as embedded advertising messages. Practical results show that the proposed model works well and can gather news information efficiently and automatically.
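The pipeline outlined in the abstract (list page, rule-matched content links, content extraction with ad filtering) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' Information Gathering and Filter Algorithm, which the abstract does not specify. The link pattern, the ad-block pattern, the function names, and the reading of "refining" a link as resolving it to an absolute URL are all hypothetical.

```python
import re
from urllib.parse import urljoin

# Hypothetical acquisition rule: content-page links end in news/<digits>.html
LINK_RULE = re.compile(r'href="([^"]*news/\d+\.html)"')
# Hypothetical filter rule: drop any <div class="ad">...</div> block
AD_RULE = re.compile(r'<div class="ad">.*?</div>', re.DOTALL)
# Crude tag stripper, adequate only for this illustration
TAG_RULE = re.compile(r'<[^>]+>')


def extract_content_links(list_html, base_url):
    """Return absolute, de-duplicated content-page URLs found in a list page."""
    links = []
    for href in LINK_RULE.findall(list_html):
        url = urljoin(base_url, href)  # assumed meaning of "refining" a link
        if url not in links:
            links.append(url)
    return links


def extract_news_text(content_html):
    """Remove configured ad blocks, strip tags, and collapse whitespace."""
    without_ads = AD_RULE.sub("", content_html)
    text = TAG_RULE.sub(" ", without_ads)
    return " ".join(text.split())


if __name__ == "__main__":
    list_page = (
        '<a href="/news/101.html">A</a>'
        '<a href="/about.html">About</a>'
        '<a href="news/102.html">B</a>'
    )
    for link in extract_content_links(list_page, "http://example.com/index.html"):
        print(link)
    page = '<h1>Title</h1><div class="ad">Buy now</div><p>Body text.</p>'
    print(extract_news_text(page))
```

Fetching the pages themselves is left out so the sketch runs without network access; in practice the HTML strings would come from loading the entry address and each extracted link.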
Keywords :
Internet; information filtering; Web structure based news gathering system; filter algorithm; information gathering; news content page; Algorithm design and analysis; Data mining; Educational institutions; Filtering algorithms; Information filters; Web pages; News Gathering; Regular Expressions; Web Gathering; Web Structure;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Future Computer Science and Education (ICFCSE), 2011 International Conference on
Conference_Location :
Xi'an
Print_ISBN :
978-1-4577-1562-4
Type :
conf
DOI :
10.1109/ICFCSE.2011.128
Filename :
6041745