DocumentCode :
2702574
Title :
Research of web information mining by using crawler techniques
Author :
Li, Qing-Cheng ; Lin, Shan ; Dong, Zhen-Hua
Author_Institution :
Dept. of Inf. Tech. Sci., Nankai Univ., Tianjin
fYear :
2008
fDate :
20-23 June 2008
Firstpage :
1603
Lastpage :
1607
Abstract :
As the Internet rapidly becomes one of the most important information media, Web information mining has been the focus of several recent research projects and papers. Massive numbers of documents in specific formats exist on the Internet, while Web crawlers running on millions of computers scan Internet pages every second. Why not combine the two efficiently? This paper describes a new approach: mining Web information by using crawler techniques. After explaining the basic principles of crawler techniques, we present the architecture of the new Web information mining system. As an initial test, the system is applied to mine documents in certain standard formats; the experimental data are reported in Section IV. From analysis of the results, we conclude that using crawler techniques gives the system high efficiency, flexibility, and low cost.
Keywords :
Internet; data mining; Internet page; Web information mining system; crawler technique; information medium; Automation; Computer architecture; Costs; Crawlers; Feeds; Fuzzy logic; Search engines; Web pages; Web sites
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Information and Automation, 2008. ICIA 2008. International Conference on
Conference_Location :
Changsha
Print_ISBN :
978-1-4244-2183-1
Electronic_ISBN :
978-1-4244-2184-8
Type :
conf
DOI :
10.1109/ICINFA.2008.4608260
Filename :
4608260