DocumentCode
2850629
Title
Text classification by boosting weak learners based on terms and concepts
Author
Bloehdorn, Stephan ; Hotho, Andreas
Author_Institution
Inst. AIFB, Karlsruhe Univ., Germany
fYear
2004
fDate
1-4 Nov. 2004
Firstpage
331
Lastpage
334
Abstract
Document representations for text classification are typically based on the classical bag-of-words paradigm. This approach comes with deficiencies that motivate the integration of features on a higher semantic level than single words. In this paper, we propose an enhancement of the classical document representation through concepts extracted from background knowledge. Boosting is used for actual classification. Experimental evaluations on two well-known text corpora support our approach through consistent improvement of the results.
Keywords
classification; ontologies (artificial intelligence); text analysis; bag-of-words paradigm; document representations; text classification; weak learner boosting; Boosting; Data engineering; Data mining; Frequency; Information retrieval; Knowledge management; Learning systems; Ontologies; Text categorization; Tree data structures
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining, 2004. ICDM '04. Fourth IEEE International Conference on
Print_ISBN
0-7695-2142-8
Type
conf
DOI
10.1109/ICDM.2004.10077
Filename
1410303