DocumentCode :
2919117
Title :
CDM: an approach to learning in text categorization
Author :
Goldberg, Jeffrey L.
Author_Institution :
Dept. of Comput. Sci., Texas A&M Univ., College Station, TX, USA
fYear :
1995
fDate :
5-8 Nov 1995
Firstpage :
258
Lastpage :
265
Abstract :
The category discrimination method (CDM) is a new learning algorithm designed for text categorization. The motivation is that there are statistical problems associated with natural language text when it is applied as input to existing machine learning algorithms (too much noise, too many features, skewed distribution). The bases of the CDM are research results about the way that humans learn categories and concepts vis-a-vis contrasting concepts. The essential formula is cue validity, borrowed from cognitive psychology and used to select from all possible single word-based features the 'best' predictors of a given category. The hypothesis that CDM's performance exceeds that of two non-domain-specific algorithms, Bayesian classification and decision tree learners, is empirically tested.
Keywords :
Bayes methods; algorithm theory; category theory; decision theory; document handling; knowledge engineering; learning (artificial intelligence); natural languages; pattern classification; statistical analysis; trees (mathematics); Bayesian classification; algorithm performance; best category predictors; category discrimination method; cognitive psychology; cue validity; decision tree learners; human category learning; human concept learning; learning algorithm; machine learning algorithms; natural language text; nondomain specific algorithms; single word-based features; statistical problems; text categorization; Algorithm design and analysis; Bayesian methods; Classification algorithms; Classification tree analysis; Decision trees; Humans; Machine learning algorithms; Natural languages; Psychology; Text categorization;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Tools with Artificial Intelligence, 1995. Proceedings., Seventh International Conference on
Conference_Location :
Herndon, VA
ISSN :
1082-3409
Print_ISBN :
0-8186-7312-5
Type :
conf
DOI :
10.1109/TAI.1995.479592
Filename :
479592