Title :
Internet medicine information monitoring system based on focused crawler
Author :
Yan, Hong-yi ; Hao, Ping
Author_Institution :
Coll. of Inf. Eng., Zhejiang Univ. of Technol., Hangzhou, China
Abstract :
Monitoring medicine trade information on the Internet is difficult. To address this problem, the paper analyzes the search algorithms of focused crawlers and proposes a combined strategy for searching a specific topic on the Internet. The combined strategy consists of page searching and relevance analysis. Page searching uses an improved Fish-Search algorithm. Relevance analysis uses a distributed (two-step) algorithm: the first step applies the vector space model (VSM) to roughly identify the broad topic, and the second step applies an improved Naive Bayes classification algorithm to select the relevant small topic from the results of the first step. Based on this research, an Internet medicine information monitoring system was developed. Tests on pages from several websites and forums show that the combined search strategy improves the harvest ratio of the focused crawler system and the efficiency of its small-topic search.
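The two-step relevance analysis described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify their improvements to Naive Bayes, their term weighting, or their thresholds, so the sketch assumes raw term-frequency vectors with cosine similarity for the VSM step and a plain multinomial Naive Bayes with Laplace smoothing for the second step. All names (tokenize, cosine_similarity, NaiveBayes, two_stage_filter) and the threshold value are hypothetical. The Fish-Search page-searching part is not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class NaiveBayes:
    """Plain multinomial Naive Bayes with Laplace smoothing (assumed variant)."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of token counts
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocab = set()

    def train(self, label, text):
        tokens = tokenize(text)
        self.word_counts.setdefault(label, Counter()).update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for t in tokens:
                score += math.log((counts[t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def two_stage_filter(page_text, topic_vector, classifier, threshold=0.1):
    """Step 1: coarse VSM check against the broad topic.
    Step 2: Naive Bayes picks the small topic for pages that pass step 1.
    Returns None when the page is off-topic."""
    page_vector = Counter(tokenize(page_text))
    if cosine_similarity(page_vector, topic_vector) < threshold:
        return None
    return classifier.predict(page_text)
```

In this sketch, a page is first compared with a broad "medicine" term vector, and only pages above the assumed threshold are passed to the classifier, which would be trained beforehand on labeled examples of each small topic (for example, trade-related versus informational pages). Running the cheap VSM check first keeps the classifier from being applied to clearly unrelated pages.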
Keywords :
Algorithm design and analysis; Computer science; Computerized monitoring; Crawlers; Educational institutions; Functional analysis; Information analysis; Internet; System testing; Uniform resource locators; Distributed relativity algorithm; Fish-Search algorithm; Focused crawler
Conference_Title :
Information Sciences and Interaction Sciences (ICIS), 2010 3rd International Conference on
Conference_Location :
Chengdu, China
Print_ISBN :
978-1-4244-7384-7
Electronic_ISBN :
978-1-4244-7386-1
DOI :
10.1109/ICICIS.2010.5534784