Title :
Rule-based text categorization using hierarchical categories
Author :
Sasaki, Minoru ; Kita, Kenji
Author_Institution :
Fac. of Eng., Tokushima Univ., Japan
Abstract :
Document categorization, defined as the classification of text documents into one of several fixed classes or categories, has become important with the explosive growth of the World Wide Web. The goal of the work described here is to categorize Web documents automatically in order to enable effective retrieval of Web information. In this paper, we propose an efficient method for hierarchical document categorization based on the rule learning algorithm RIPPER (Repeated Incremental Pruning to Produce Error Reduction).
Keywords :
classification; data mining; information resources; information retrieval; knowledge based systems; learning (artificial intelligence); Web information retrieval; World Wide Web; document categorization; hierarchical categories; hierarchical document categorization; rule learning algorithm RIPPER; rule-based text categorization; text documents classification; Explosives; Filtering; Humans; Learning systems; Painting; Partitioning algorithms; Text categorization; Web sites
Conference_Title :
1998 IEEE International Conference on Systems, Man, and Cybernetics
Conference_Location :
San Diego, CA
Print_ISBN :
0-7803-4778-1
DOI :
10.1109/ICSMC.1998.725090