Title :
A hyper-heuristic approach to design and tuning heuristic methods for web document clustering
Author :
Cobos, Carlos ; Mendoza, Martha ; León, Elizabeth
Author_Institution :
Comput. Sci. Dept., Univ. del Cauca, Popayan, Colombia
Abstract :
This paper introduces HHWDC, a new description-centric algorithm for web document clustering. HHWDC is designed using a hyper-heuristic approach, which allows it to define the best algorithm for web document clustering. It offers two heuristic selection methods: random selection and roulette wheel selection based on the performance of the low-level heuristics (harmony search, improved harmony search, novel global harmony search, global-best harmony search, restrictive mating, roulette wheel selection, and particle swarm optimization). HHWDC uses the k-means algorithm as its local solution improvement strategy and automatically determines the number of clusters using the Bayesian Information Criterion. It supports two acceptance/replacement strategies: Replace the Worst and Restricted Competition Replacement. HHWDC was tested on data sets based on Reuters-21578 and DMOZ, with promising results (better precision than a Singular Value Decomposition algorithm).
Keywords :
Bayes methods; Internet; document handling; particle swarm optimisation; pattern clustering; Bayesian information criteria; DMOZ; HHWDC algorithm; Web document clustering; data set; description-centric algorithm; heuristic method design; heuristic method tuning; hyper-heuristic approach; k-means algorithm; local solution improvement strategy; restricted competition replacement; Algorithm design and analysis; Bandwidth; Clustering algorithms; Heuristic algorithms; Particle swarm optimization; Partitioning algorithms; Time division multiplexing; genetic algorithm; harmony search; hyper-heuristic; memetic algorithm; particle swarm; web document clustering
Conference_Title :
Evolutionary Computation (CEC), 2011 IEEE Congress on
Conference_Location :
New Orleans, LA
Print_ISBN :
978-1-4244-7834-7
DOI :
10.1109/CEC.2011.5949773