Title :
Web archiving strategies by using Web mining techniques
Author :
Kawano, Hiroyuki
Author_Institution :
Dept. of Syst. Sci., Kyoto Univ., Japan
Abstract :
To preserve the huge volume of born-digital information on the Internet, the National Diet Library of Japan has been developing an experimental web archiving system, WARP (http://warp.ndl.go.jp/). However, in order to handle monotonically increasing digital information, we must consider many difficult problems of long-term data preservation from various technical aspects. In this paper, we apply web mining techniques to web archiving strategies. Our strategies are based on our experience with our Mondou web search engine and web robots, which are built on text/web mining technologies.
Keywords :
Internet; information storage; search engines; Mondou web search engine; Web archiving strategies; Web mining techniques; digital information storage; National Diet Library; web robots; Data mining; Data visualization; Database systems; Robots; Software libraries; Web mining; Web search; Web server
Conference_Title :
Communications, Computers and Signal Processing, 2003. PACRIM. 2003 IEEE Pacific Rim Conference on
Print_ISBN :
0-7803-7978-0
DOI :
10.1109/PACRIM.2003.1235932