Document Classification Based on Support Vector Machine Using a Concept Vector Model

Author

Deng, Shuang ; Peng, Hong

Author_Institution

Sch. of Math. & Comput. Sci., Xihua Univ., Sichuan

fYear

2006

fDate

18-22 Dec. 2006

Firstpage

473

Lastpage

476

Abstract

This paper proposes a new method for document categorization based on a support vector machine (SVM) using a concept vector model (CVM). Traditional document classification usually ignores the semantic relations among keywords or documents. To address this semantic problem, a domain ontology is used to capture the semantic information among different terms or keywords in the documents. Using the concept vector model, domain-related semantic information can be extracted more precisely from documents. In the model, a concept vector is extracted from a document by a matching method. According to their concept features, documents are then classified into a suitable category by the SVM. The experimental results show that the CVM method yields higher accuracy than traditional term-based vector space model (VSM) methods.
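
The abstract does not spell out the matching method, so the following is only a minimal sketch of one plausible reading: each document token is matched against the terms attached to ontology concepts, and the per-concept match counts form the concept vector. The toy ontology, the concept names, and the function `concept_vector` are all hypothetical and are not taken from the paper.

```python
import re
from collections import Counter

# Hypothetical toy domain ontology: concept -> terms that express it.
# A real domain ontology would also carry relations between concepts.
ONTOLOGY = {
    "machine_learning": {"svm", "classifier", "kernel", "training"},
    "information_retrieval": {"query", "index", "ranking", "retrieval"},
    "ontology": {"ontology", "concept", "taxonomy", "semantic"},
}

# Fixed concept order so every document maps to the same vector layout.
CONCEPTS = sorted(ONTOLOGY)


def concept_vector(text):
    """Return per-concept counts of tokens in `text` that match ontology terms."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for token in tokens:
        for concept, terms in ONTOLOGY.items():
            if token in terms:
                counts[concept] += 1
    return [counts[c] for c in CONCEPTS]


doc = "The SVM classifier uses a kernel; the ontology supplies each concept."
vector = concept_vector(doc)
print(dict(zip(CONCEPTS, vector)))
```

Vectors built this way would serve as the input features for an SVM classifier in place of term-based VSM features; the classification step is omitted here.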

Keywords

classification; information retrieval; ontologies (artificial intelligence); semantic networks; support vector machines; SVM; concept vector model; document categorization; document classification; domain ontology; domain-related semantic information; support vector machine; Classification tree analysis; Computer science; Data mining; Decision trees; Mathematical model; Mathematics; Ontologies; Probability; Support vector machine classification; Support vector machines;

fLanguage

English

Publisher

IEEE

Conference_Titel

Web Intelligence, 2006. WI 2006. IEEE/WIC/ACM International Conference on

Conference_Location

Hong Kong

Print_ISBN

0-7695-2747-7

Type

conf

DOI

10.1109/WI.2006.65

Filename

4061413