DocumentCode :
1992486
Title :
Research on discovering Deep Web entries based on topic crawling and ontology
Author :
Liu, Gang ; Liu, Kai ; Dang, Yuan-yuan
Author_Institution :
Sch. of Comput. Sci. & Eng., Changchun Univ. of Technol., Changchun, China
fYear :
2011
fDate :
16-18 Sept. 2011
Firstpage :
2488
Lastpage :
2490
Abstract :
With the wide application of Web databases, more and more Web content lies in the Deep Web. To use Deep Web resources effectively, Deep Web data must be integrated on a large scale, and data source discovery is the first task of that integration. Finding Deep Web sites efficiently is therefore the key to Deep Web data integration. This paper proposes a method for automatically discovering Deep Web entries. First, information from Deep Web entry forms in a specific domain is used to build a domain ontology. A topic crawler then crawls the Web and checks each page for forms. If a page contains a form, the form's attributes are extracted and a weight is calculated from the form's attributes and the ontology. The page is downloaded when the weight is greater than a fixed value. Finally, test words are used to examine the downloaded pages and identify high-quality Deep Web entry pages.
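The form-weighting step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the weight formula, the ontology contents, or the threshold, so everything here is an assumption: the `ONTOLOGY` terms and weights, the `THRESHOLD` value, the use of form field `name` attributes as the form's attributes, and the simple sum of matched term weights are all hypothetical stand-ins, not the authors' method.

```python
from html.parser import HTMLParser

# Hypothetical domain ontology for a book-search domain: term -> weight.
# The paper builds its ontology from real entry forms; these values are illustrative only.
ONTOLOGY = {"title": 0.3, "author": 0.3, "isbn": 0.25, "publisher": 0.15}

# Assumed cut-off; the abstract only says "a fixed value".
THRESHOLD = 0.5


class FormAttributeExtractor(HTMLParser):
    """Collect the name attributes of form controls that appear inside <form> tags."""

    def __init__(self):
        super().__init__()
        self.in_form = False
        self.attributes = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.in_form = True
        elif self.in_form and tag in ("input", "select", "textarea"):
            name = dict(attrs).get("name")
            if name:
                self.attributes.append(name.lower())

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def form_weight(html, ontology=ONTOLOGY):
    """Sum the ontology weights of the distinct form attributes found in the page."""
    parser = FormAttributeExtractor()
    parser.feed(html)
    matched = set(parser.attributes) & set(ontology)
    return sum(ontology[term] for term in matched)


def should_download(html, threshold=THRESHOLD):
    """Keep the page only when its form weight is greater than the fixed value."""
    return form_weight(html) > threshold
```

Under these assumptions, a search form with `title` and `author` fields would pass the threshold, while a login form with `user` and `password` fields would score zero and be skipped. The final filtering with test words mentioned in the abstract is not covered by this sketch.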
Keywords :
Internet; Web sites; data mining; ontologies (artificial intelligence); Deep Web data Integration; Deep Web entry automatic discovery method; Deep Web resource; Deep Web site; Web database; Web page; attribute extraction; data sources discover; domain ontology; topic crawling; Accuracy; Bayesian methods; Correlation; Crawlers; Databases; HTML; Ontologies; Bayes Classifier; Data Source Discovery; Deep Web; Ontology;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Electrical and Control Engineering (ICECE), 2011 International Conference on
Conference_Location :
Yichang
Print_ISBN :
978-1-4244-8162-0
Type :
conf
DOI :
10.1109/ICECENG.2011.6057954
Filename :
6057954