DocumentCode
617853
Title
An evolutionary algorithm for Feature Selective Double Clustering of text documents
Author
Nourashrafeddin, S.N. ; Milios, Evangelos ; Arnold, Dirk V.
Author_Institution
Fac. of Comput. Sci., Dalhousie Univ., Halifax, NS, Canada
fYear
2013
fDate
20-23 June 2013
Firstpage
446
Lastpage
453
Abstract
We propose FSDC, an evolutionary algorithm for Feature Selective Double Clustering of text documents. We first cluster the terms existing in the document corpus. The term clusters are then fed into multiobjective genetic algorithms to prune non-informative terms and form sets of keyterms representing topics. Based on the topic keyterms found, representative documents for each topic are extracted. These documents are then used as seeds to cluster all documents in the dataset. FSDC is compared to some well-known co-clusterers on real text datasets. The experimental results show that our algorithm can outperform the competitors.
Keywords
feature extraction; genetic algorithms; pattern clustering; text analysis; FSDC; document corpus; evolutionary algorithm; feature selective double clustering; multiobjective genetic algorithms; noninformative term pruning; real text datasets; representative document extraction; text documents; topic keyterm set; Approximation algorithms; Clustering algorithms; Computer science; Evolutionary computation; Genetic algorithms; Mutual information; Vectors; Genetic algorithm; co-clustering; multiobjective optimization; text clustering
fLanguage
English
Publisher
IEEE
Conference_Titel
Evolutionary Computation (CEC), 2013 IEEE Congress on
Conference_Location
Cancun
Print_ISBN
978-1-4799-0453-2
Electronic_ISBN
978-1-4799-0452-5
Type
conf
DOI
10.1109/CEC.2013.6557603
Filename
6557603