Regional Information Center for Science and Technology (RICeST) - Research of web information mining by using crawler techniques

DocumentCode :

2702574

Title :

Research of web information mining by using crawler techniques

Author :

Li, Qing-Cheng ; Lin, Shan ; Dong, Zhen-Hua

Author_Institution :

Dept. of Inf. Tech. Sci., Nankai Univ., Tianjin

fYear :

2008

fDate :

20-23 June 2008

Firstpage :

1603

Lastpage :

1607

Abstract :

As the Internet rapidly becomes one of the most important information media, Web information mining has been the focus of several recent research projects and papers. The Internet holds massive numbers of documents in specific formats, while Web crawlers running on millions of computers crawl Internet pages every second. Why not combine the two efficiently? This paper describes a new approach: mining Web information by using crawler techniques. After explaining the basic principle of crawler techniques, we present the architecture of the new Web information mining system. As an initial test, the system is applied to mine documents in certain standard formats; the experimental data are reported in Section IV. From the analysis of the results, we conclude that using crawler techniques gives the system high efficiency, flexibility, and low cost.

Keywords :

Internet; data mining; Internet page; Web information mining system; crawler technique; information medium; Automation; Computer architecture; Costs; Crawlers; Feeds; Fuzzy logic; Search engines; Web pages; Web sites

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Information and Automation, 2008. ICIA 2008. International Conference on

Conference_Location :

Changsha

Print_ISBN :

978-1-4244-2183-1

Electronic_ISBN :

978-1-4244-2184-8

Type :

conf

DOI :

10.1109/ICINFA.2008.4608260

Filename :

4608260

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2702574