DocumentCode
2723461
Title
Ontology-Based Focused Crawling
Author
Luong, Hiep Phuc ; Gauch, Susan ; Wang, Qiang
fYear
2009
fDate
1-7 Feb. 2009
Firstpage
123
Lastpage
128
Abstract
Ontology learning has become a major area of research whose goal is to facilitate the construction of ontologies by decreasing the amount of effort required to produce an ontology for a new domain. However, few studies attempt to automate the entire ontology learning process, from the collection of domain-specific literature to the text mining used to build new ontologies or enrich existing ones. In this paper, we present a framework for ontology learning that enables us to retrieve documents from the Web using focused crawling in a biological domain, amphibian morphology. We use an SVM (support vector machine) classifier to identify domain-specific documents and perform text mining in order to extract useful information for the ontology enrichment process. This paper reports on the overall system architecture and our initial experiments on the focused crawler and document classification.
Keywords
data mining; information retrieval; learning (artificial intelligence); ontologies (artificial intelligence); pattern classification; search engines; support vector machines; text analysis; SVM classifier; Web document retrieval; amphibian morphology; biological domain; document classification; domain-specific document; information extraction; ontology enrichment process; ontology learning; ontology-based focused crawling; support vector machine; text mining; Buildings; Crawlers; Data mining; Humans; Morphology; Ontologies; Semantic Web; Support vector machine classification; Support vector machines; Text mining; SVM; focused crawler; ontology; ontology learning; text classification;
fLanguage
English
Publisher
IEEE
Conference_Titel
Information, Process, and Knowledge Management, 2009. eKNOW '09. International Conference on
Conference_Location
Cancun
Print_ISBN
978-1-4244-3362-9
Electronic_ISBN
978-0-7695-3531-9
Type
conf
DOI
10.1109/eKNOW.2009.26
Filename
4782576