DocumentCode :
1946177
Title :
A Web warehouse system for semi-automatically gathering and managing online news
Author :
Cheng, Kai ; Wang, Hanfei
Author_Institution :
Dept. of Social Inf. Syst., Kyushu Sangyo Univ., Japan
fYear :
2005
fDate :
19-21 May 2005
Firstpage :
343
Lastpage :
344
Abstract :
In this paper we propose a Web warehouse system that gathers and manages online news semi-automatically, serving as an intermediate information repository for a given user community. We describe its architecture and an ontology-based focused crawler that automatically collects relevant news documents. We further discuss the problem of efficiently managing the hit frequency profile of all visited news stories and propose a randomized data structure, the Aging Bloom Filter (ABF), to address it. We demonstrate that the proposed system can save a substantial amount of Web traffic and online time when individual users search for and retrieve relevant online news.
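The abstract names the Aging Bloom Filter but does not describe how it works internally. As a rough illustration of the general idea of tracking hit frequencies in a compact randomized structure, the sketch below uses a counting Bloom filter whose counters are halved periodically so that old hits fade. This is an assumption made for illustration, not the authors' ABF; the class name, the halving rule, and all parameters are hypothetical.

```python
import hashlib


class DecayingCountingBloomFilter:
    """Counting Bloom filter with a halving step to fade out old hits.

    Hypothetical illustration only; this is not the ABF from the paper.
    """

    def __init__(self, num_counters=1024, num_hashes=3):
        self.num_counters = num_counters
        self.num_hashes = num_hashes
        self.counters = [0] * num_counters

    def _positions(self, key):
        # Derive num_hashes counter indexes from salted SHA-256 digests.
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_counters

    def record_hit(self, key):
        # Increment every counter that the key hashes to.
        for pos in self._positions(key):
            self.counters[pos] += 1

    def estimate_hits(self, key):
        # The smallest counter never underestimates the key's decayed count;
        # collisions with other keys can only inflate it.
        return min(self.counters[pos] for pos in self._positions(key))

    def age(self):
        # Halve all counters (integer division) so older hits weigh less.
        self.counters = [c // 2 for c in self.counters]
```

In such a scheme, a warehouse might call `record_hit` with a story identifier on each access and call `age` at fixed intervals, so that `estimate_hits` reflects recent popularity rather than the all-time total. The memory cost stays fixed at `num_counters` integers regardless of how many stories are seen, at the price of occasional overestimates.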
Keywords :
Internet; data structures; data warehouses; information retrieval; online front-ends; ontologies (artificial intelligence); search engines; Aging Bloom Filter; Web traffic; Web warehouse system; hit frequency profile; intermediate information repository; online news retrieval; ontology-based architecture; randomized data structure; search engines; Aging; Computer architecture; Crawlers; Data structures; Filters; Frequency; Information science; Ontologies; Search engines; Service oriented architecture;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Active Media Technology, 2005. (AMT 2005). Proceedings of the 2005 International Conference on
Print_ISBN :
0-7803-9035-0
Type :
conf
DOI :
10.1109/AMT.2005.1505366
Filename :
1505366