Towards automatic document classification by exploiting only knowledge resources

Author

Gleidson Antonio Cardoso da Silva;Carina F. Dorneles

Author_Institution

Federal University of Santa Catarina, Florianopolis, Brazil

fYear

2015

Firstpage

Lastpage

Abstract

Document classification is critical for optimizing information retrieval tasks, especially on the web. In this environment, the open-domain nature and growing volume of available data remain a challenge for classification. In this paper, we address these problems using only knowledge resources. Our approach relies on concept instances derived from the document and on an open-domain knowledge base for concept generalization. The set of broader concepts is ranked according to a disparity value, and the top-ranked concept is taken as the document's class label. Experimental results on real-world datasets show that this approach can classify documents without the need to build an ontology or to train and maintain a classification model.
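
The pipeline outlined in the abstract (extract concept instances, generalize them through a knowledge base, rank the broader concepts, take the top one as the label) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `BROADER` dictionary is a hypothetical toy stand-in for an open-domain knowledge base, `classify` is a hypothetical helper, and the score is a simple coverage fraction, because the abstract does not define how the disparity value is computed. It is not the authors' implementation.

```python
from collections import Counter

# Hypothetical toy knowledge base: concept instance -> broader concepts.
BROADER = {
    "python": ["programming language", "snake"],
    "java": ["programming language", "island"],
    "haskell": ["programming language"],
}


def classify(tokens, broader=BROADER):
    """Return (label, ranking) for a tokenized document."""
    # 1. Concept instances: distinct tokens known to the knowledge base.
    instances = {t for t in tokens if t in broader}
    # 2. Generalization: count how many instances map to each broader concept.
    counts = Counter(c for t in instances for c in broader[t])
    if not counts:
        return None, []
    # 3. Ranking: placeholder score (NOT the paper's disparity value),
    #    the fraction of distinct instances a broader concept covers.
    n = len(instances)
    ranking = sorted(
        ((concept, k / n) for concept, k in counts.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )
    # 4. The top-ranked broader concept serves as the class label.
    return ranking[0][0], ranking


label, ranking = classify("python and java and haskell".split())
print(label)
```

In this toy setup, ambiguous instances such as "python" contribute to several broader concepts, and the ranking favors the concept shared by the most instances in the document.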

Keywords

"Knowledge based systems","Ontologies","Training","Proposals","Semantics","Informatics","Information retrieval"

Publisher

IEEE

Conference_Title

2015 34th International Conference of the Chilean Computer Science Society (SCCC)

Type

conf

DOI

10.1109/SCCC.2015.7416573

Filename

7416573

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=3752790