• DocumentCode
    2463916
  • Title
    Document Clustering Using Differential Evolution
  • Author
    Abraham, Ajith; Das, Swagatam; Konar, Amit
  • Author_Institution
    Chung Ang Univ., Seoul
  • fYear
    2006
  • fDate
    0-0 0
  • Firstpage
    1784
  • Lastpage
    1791
  • Abstract
    This paper investigates a novel approach to partitional clustering of a large collection of text documents using an improved version of the classical differential evolution (DE) algorithm. Fast and accurate clustering of documents plays an important role in text mining and automatic information retrieval systems. The k-means algorithm has served as the most widely used partitional clustering algorithm for text documents. However, in most cases it provides only locally optimal solutions. In this work, the clustering problem is formulated as an optimization task and solved using a modified DE algorithm. To reduce the computational time, a hybrid method combining k-means with DE is also proposed. The new algorithms were tested on a number of document datasets. Comparison with k-means, a state-of-the-art PSO, and a recently proposed real-coded GA-based text clustering method shows the superiority of the proposed techniques in terms of speed and quality of clustering.
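    The abstract's idea of treating clustering as an optimization task can be illustrated with a minimal sketch. It uses the classical DE/rand/1/bin scheme, not the modified DE or the k-means hybrid proposed in the paper, whose details are not given in this record. The function names (`de_cluster`, `fitness`), the sum-of-squared-distances objective, the toy 2-D vectors, and the parameter values `F=0.8` and `CR=0.9` are all assumptions made for illustration.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def split_centroids(candidate, k, dim):
    """Cut a flat candidate vector into k centroids of length dim."""
    return [candidate[i * dim:(i + 1) * dim] for i in range(k)]

def fitness(candidate, docs, k):
    """Assumed objective: total squared distance of each document to its nearest centroid."""
    centroids = split_centroids(candidate, k, len(docs[0]))
    return sum(min(sq_dist(d, c) for c in centroids) for d in docs)

def de_cluster(docs, k, pop_size=20, generations=100, F=0.8, CR=0.9, seed=0):
    """Classical DE/rand/1/bin over flattened centroid vectors (illustrative only)."""
    rng = random.Random(seed)
    dim = len(docs[0])
    n = k * dim
    # Initialise each candidate from k distinct, randomly chosen documents.
    pop = []
    for _ in range(pop_size):
        cand = []
        for d in rng.sample(docs, k):
            cand.extend(d)
        pop.append(cand)
    scores = [fitness(c, docs, k) for c in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct population members other than i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(n)  # guarantees at least one mutated gene
            trial = [
                pop[a][j] + F * (pop[b][j] - pop[c][j])
                if (rng.random() < CR or j == j_rand) else pop[i][j]
                for j in range(n)
            ]
            s = fitness(trial, docs, k)
            # Greedy selection: keep the trial only if it is no worse.
            if s <= scores[i]:
                pop[i], scores[i] = trial, s
    best = min(range(pop_size), key=scores.__getitem__)
    centroids = split_centroids(pop[best], k, dim)
    labels = [min(range(k), key=lambda c: sq_dist(d, centroids[c])) for d in docs]
    return centroids, labels, scores[best]

# Toy "document vectors": two well-separated groups in 2-D.
docs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
        [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
centroids, labels, score = de_cluster(docs, k=2)
print(labels, round(score, 3))
```

    Real document collections would typically be represented as high-dimensional term-weight vectors rather than 2-D points; the toy data here only keeps the sketch short.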
  • Keywords
    data mining; document handling; evolutionary computation; information retrieval; pattern clustering; automatic information retrieval systems; differential evolution; document clustering; document datasets; hybrid k-means; partitional clustering; text clustering methods; text documents; text mining; Clustering algorithms; Clustering methods; Computer science; Genetic algorithms; Information retrieval; Particle swarm optimization; Partitioning algorithms; Testing; Text mining; Tree graphs;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Evolutionary Computation, 2006. CEC 2006. IEEE Congress on
  • Conference_Location
    Vancouver, BC
  • Print_ISBN
    0-7803-9487-9
  • Type
    conf
  • DOI
    10.1109/CEC.2006.1688523
  • Filename
    1688523