Title :
Extended ACO based document clustering with hybrid distance metric
Author :
Subhadra, K. ; Shashi, M. ; Das, Abhishek
Author_Institution :
Dept. of CSE, GITAM Univ., Visakhapatnam, India
Abstract :
Problems in information retrieval often require handling large amounts of high-dimensional data. This paper addresses the problem of grouping similar documents into clusters and then retrieving the required document in response to a user's query. It introduces a novel approach to document clustering that combines a swarm intelligence technique, based on the brood behavior of ants, with standard clustering approaches. The main idea is to apply a nature-inspired algorithm, Ant Colony Optimization (ACO), for a limited number of iterations, followed by medoid-based post-pruning. The proposed method was applied to a dataset of 10000 documents on various topics collected from the standard data repository Wikipedia. The experimental results show that the proposed method achieves better clustering results in terms of precision and recall, and in considerably less time.
Keywords :
ant colony optimisation; document handling; pattern clustering; query processing; swarm intelligence; ACO algorithm; ACO based document clustering; Wikipedia; ant colony optimization; ants behavior; data repository; document retrieval; high dimensional data; hybrid distance metric; information retrieval; medoid based post pruning; nature inspired algorithm; similar documents grouping; swarm intelligence technique; user query; Complexity theory; Electronic publishing; Encyclopedias; Internet; Measurement; Clustering; optimization; purity; recall; swarm intelligence;
Conference_Title :
Electrical, Computer and Communication Technologies (ICECCT), 2015 IEEE International Conference on
Conference_Location :
Coimbatore
Print_ISBN :
978-1-4799-6084-2
DOI :
10.1109/ICECCT.2015.7226090