DocumentCode
436137
Title
Genetic extraction of text category descriptions
Author
Serrano, J.I. ; del Castillo, M.D.
Author_Institution
Instituto de Automatica Industrial, CSIC. Ctra. Campo Real, km. 0.200. La Poveda, Arganda del Rey, 28500 Madrid, SPAIN
Volume
16
fYear
2004
fDate
June 28 2004-July 1 2004
Firstpage
7
Lastpage
12
Abstract
This paper deals with a supervised learning method devoted to producing categorization models of text documents. The goal of the method is to use a suitable numerical measurement of example similarity to find centroids describing different categories of examples. The centroids are neither abstract nor statistical models, but rather consist of bits of examples. The centroid-learning method is based on a genetic algorithm, the GAT. The categorization system infers a model by applying the GAT to the set of preclassified documents. The models thus obtained are the category centroids that are used to predict the category of a new document.
Keywords
Genetic algorithms; Humans; Internet; Machine learning; Natural languages; Organizing; Predictive models; Supervised learning; Terminology; Text categorization; centroid; evolutionary learning; similarity function; text classification;
fLanguage
English
Publisher
IEEE
Conference_Titel
Automation Congress, 2004. Proceedings. World
Conference_Location
Seville
Print_ISBN
1-889335-21-5
Type
conf
Filename
1438624