• DocumentCode
    436137
  • Title

    Genetic extraction of text category descriptions

  • Author

    Serrano, J.I. ; del Castillo, M.D.

  • Author_Institution
    Instituto de Automatica Industrial, CSIC. Ctra. Campo Real, km. 0.200. La Poveda, Arganda del Rey, 28500 Madrid, SPAIN
  • Volume
    16
  • fYear
    2004
  • fDate
    June 28 2004-July 1 2004
  • Firstpage
    7
  • Lastpage
    12
  • Abstract
    This paper deals with a supervised learning method devoted to producing categorization models of text documents. The goal of the method is to use a suitable numerical measurement of example similarity to find centroids describing different categories of examples. The centroids are neither abstract nor statistical models, but rather consist of bits of examples. The centroid-learning method is based on a genetic algorithm, the GAT. The categorization system infers a model by applying the GAT to the set of preclassified documents. The models thus obtained are the category centroids that are used to predict the category of a new document.
  • Keywords
    Genetic algorithms; Humans; Internet; Machine learning; Natural languages; Organizing; Predictive models; Supervised learning; Terminology; Text categorization; centroid; evolutionary learning; similarity function; text classification;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automation Congress, 2004. Proceedings. World
  • Conference_Location
    Seville
  • Print_ISBN
    1-889335-21-5
  • Type

    conf

  • Filename
    1438624