DocumentCode :
3419924
Title :
Automatic classification of Web pages based on the concept of domain ontology
Author :
Song, Mu-Hee ; Lim, Soo-Yeon ; Kang, Dong-Jin ; Lee, Sang-Jo
Author_Institution :
Dept. of Comput. Eng., Kyungpook Nat. Univ., Daegu, South Korea
fYear :
2005
fDate :
15-17 Dec. 2005
Abstract :
The use of ontologies to enable machine reasoning has increased steadily over the last few years. This paper proposes an automated method for document classification using an ontology, which expresses the terminology and vocabulary contained in Web documents as a hierarchical structure. Ontology-based document classification involves determining the document features that represent the Web documents most accurately, then analyzing their contents and classifying them into the most appropriate categories, using at least two predefined categories for the given document features. In this paper, Web pages are classified in real time, not with experimental data or a learning process, but by similarity calculations between the terminology information extracted from the Web pages and the ontology categories. This results in more accurate document classification, since the meanings and relationships unique to each document are determined.
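The similarity-based classification described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the term-extraction procedure, or the ontology contents, so everything here is an assumption made for illustration: the toy `ONTOLOGY` dictionary, the regular-expression tokenizer, and the use of cosine similarity over term weights are hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter

# Hypothetical toy ontology: category -> term weights (assumed for illustration,
# not taken from the paper).
ONTOLOGY = {
    "economy": {"market": 1.0, "stock": 1.0, "bank": 1.0, "trade": 1.0},
    "sports": {"team": 1.0, "match": 1.0, "score": 1.0, "league": 1.0},
}


def extract_terms(text):
    """Lowercase the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(text, ontology=ONTOLOGY):
    """Return (best_category, scores); best_category is None if nothing matches."""
    terms = extract_terms(text)
    scores = {cat: cosine(terms, vocab) for cat, vocab in ontology.items()}
    best = max(scores, key=scores.get)
    if scores[best] == 0:
        return None, scores
    return best, scores
```

Calling `classify(page_text)` on the visible text of a page returns the highest-scoring category together with the per-category scores. No training step is involved, which mirrors the abstract's point that classification relies on comparing extracted terminology with ontology categories rather than on a learning process.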
Keywords :
Web sites; classification; document handling; ontologies (artificial intelligence); vocabulary; Web document representation; Web page classification; automatic ontology-based document classification; information extraction; machine reasoning; ontology category; vocabulary; Data mining; Dictionaries; Information technology; Internet; Ontologies; Software engineering; Terminology; Vocabulary; Web pages; Document classification; Ontology; Web Page classification;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Software Engineering Conference, 2005. APSEC '05. 12th Asia-Pacific
ISSN :
1530-1362
Print_ISBN :
0-7695-2465-6
Type :
conf
DOI :
10.1109/APSEC.2005.46
Filename :
1607205