Title :
An evolutionary algorithm for Feature Selective Double Clustering of text documents
Author :
Nourashrafeddin, S.N. ; Milios, Evangelos ; Arnold, Dirk V.
Author_Institution :
Fac. of Comput. Sci., Dalhousie Univ., Halifax, NS, Canada
Abstract :
We propose FSDC, an evolutionary algorithm for Feature Selective Double Clustering of text documents. We first cluster the terms that occur in the document corpus. The term clusters are then fed into multiobjective genetic algorithms, which prune non-informative terms and form sets of keyterms representing topics. Based on the topic keyterms found, representative documents for each topic are extracted. These documents are then used as seeds to cluster all documents in the dataset. FSDC is compared with several well-known co-clustering algorithms on real text datasets. The experimental results show that our algorithm can outperform these competitors.
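The last two steps of the abstract (extracting representative documents from topic keyterms, then using them as seeds to cluster all documents) can be illustrated with a minimal sketch. This is not the authors' implementation: the keyterm sets are assumed to be given already (the term clustering and multiobjective genetic algorithm stages are omitted), and the scoring rule, the cosine-similarity assignment, and all names (`tf_vector`, `pick_seeds`, `seed_cluster`, `n_seeds`) are hypothetical choices made for illustration only.

```python
import math
from collections import Counter


def tf_vector(doc):
    """Term-frequency vector of a whitespace-tokenized, lowercased document."""
    return Counter(doc.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pick_seeds(vectors, keyterms, n_seeds=1):
    """Indices of the documents whose tokens are most concentrated on keyterms.

    Assumption: a document is 'representative' of a topic when a large share
    of its tokens are keyterms of that topic.
    """
    def score(vec):
        total = sum(vec.values())
        return sum(vec[t] for t in keyterms) / total if total else 0.0

    ranked = sorted(range(len(vectors)), key=lambda i: score(vectors[i]),
                    reverse=True)
    return ranked[:n_seeds]


def seed_cluster(docs, topic_keyterms, n_seeds=1):
    """Assign every document to the topic whose seed centroid is most similar."""
    vectors = [tf_vector(d) for d in docs]
    centroids = []
    for keyterms in topic_keyterms:
        centroid = Counter()
        for i in pick_seeds(vectors, keyterms, n_seeds):
            centroid.update(vectors[i])  # sum the seed vectors
        centroids.append(centroid)
    return [max(range(len(centroids)), key=lambda k: cosine(v, centroids[k]))
            for v in vectors]


# Toy corpus and hand-written keyterm sets (both invented for this example).
docs = [
    "genetic algorithm mutation crossover population",
    "crossover and mutation in a genetic population",
    "football match goal striker league",
    "the league striker scored a goal",
]
topic_keyterms = [
    {"genetic", "mutation", "crossover"},
    {"goal", "striker", "league"},
]
labels = seed_cluster(docs, topic_keyterms)
print(labels)  # [0, 0, 1, 1]
```

In this sketch the seed documents act as initial centroids, so documents that contain no keyterm at all can still be assigned through the vocabulary they share with a seed.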
Keywords :
feature extraction; genetic algorithms; pattern clustering; text analysis; FSDC; document corpus; evolutionary algorithm; feature selective double clustering; multiobjective genetic algorithms; noninformative term pruning; real text datasets; representative document extraction; text documents; topic keyterm set; Approximation algorithms; Clustering algorithms; Computer science; Evolutionary computation; Mutual information; Vectors; co-clustering; multiobjective optimization; text clustering
Conference_Title :
Evolutionary Computation (CEC), 2013 IEEE Congress on
Conference_Location :
Cancun
Print_ISBN :
978-1-4799-0453-2
Electronic_ISBN :
978-1-4799-0452-5
DOI :
10.1109/CEC.2013.6557603