DocumentCode :
1987987
Title :
Information mining system design and implementation based on web crawler
Author :
Lin, Shan ; Li, You-meng ; Li, Qing-Cheng
Author_Institution :
Coll. of Inf. Tech. Sci., Nankai Univ., Tianjin
fYear :
2008
fDate :
2-4 June 2008
Firstpage :
1
Lastpage :
5
Abstract :
With the information explosion caused by the World Wide Web in recent years, how to process this enormous volume of information efficiently and at a reasonable cost has become a concern of information providers, service agencies and end users. While much research focuses on how to design an efficient Web crawler, we turn our attention to making the best use of a Web crawler's results. In this paper, we describe the design and implementation of an information mining system that runs on the results of a Web crawler to extract more metadata from unstructured documents for focused search (such as RSS search). We present the software architecture of the system, describe efficient techniques for achieving high performance, and report preliminary experimental results showing that the system can address the issues of robustness, flexibility and accuracy at a low cost.
Keywords :
Internet; data mining; document handling; information retrieval; meta data; software architecture; Web crawler; World Wide Web; information mining system; information provider; metadata; service agency; software architecture; Costs; Crawlers; Data mining; Educational institutions; Electronic mail; Fuzzy logic; Internet; Search engines; Web pages; Web sites; Crawler; RSS; information mining; low cost;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
System of Systems Engineering, 2008. SoSE '08. IEEE International Conference on
Conference_Location :
Singapore
Print_ISBN :
978-1-4244-2172-5
Electronic_ISBN :
978-1-4244-2173-2
Type :
conf
DOI :
10.1109/SYSOSE.2008.4724148
Filename :
4724148