DocumentCode :
2895590
Title :
Simmered Greedy Optimization for Co-clustering
Author :
Kapadia, Sadik ; Rohwer, Richard
Author_Institution :
Google, Inc., Mountain View, CA, USA
fYear :
2010
fDate :
12-14 April 2010
Firstpage :
410
Lastpage :
419
Abstract :
We present a fast yet highly effective stochastic algorithm, Simmered Greedy Optimization (SG(N)), for solving the co-clustering problem: to simultaneously cluster two finite sets by maximizing the mutual information between the clusterings. (Clustering one set by this criterion is a special case.) This is a combinatorial optimization problem of great interest for deriving maximally predictive feature sets. Co-clustering has found applications in many areas, particularly statistical natural language processing and bioinformatics. We report results of tests on a suite of statistical natural language problems, comparing SG(N) with simulated annealing and a publicly available implementation of co-clustering. In all cases we obtain superior results with far less computation using SG(N).
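The objective named in the abstract, the mutual information between two clusterings, can be illustrated with a minimal sketch. It is not the SG(N) algorithm, which the abstract does not specify; it only evaluates the quantity a co-clustering method would try to maximize. The function name, the dictionary-based input layout, and the toy data are assumptions made for illustration.

```python
from collections import defaultdict
from math import log


def coclustering_mutual_information(counts, row_cluster, col_cluster):
    """Mutual information (in nats) between a row and a column clustering.

    counts      -- dict mapping (row_item, col_item) to a co-occurrence count
    row_cluster -- dict mapping each row item to a cluster label
    col_cluster -- dict mapping each column item to a cluster label
    (All three layouts are hypothetical, chosen for this sketch.)
    """
    # Aggregate item-level counts into cluster-level joint counts.
    joint = defaultdict(float)
    for (r, c), n in counts.items():
        joint[(row_cluster[r], col_cluster[c])] += n

    total = sum(joint.values())
    if total == 0:
        return 0.0

    # Marginal counts for each row cluster and each column cluster.
    row_marg = defaultdict(float)
    col_marg = defaultdict(float)
    for (a, b), n in joint.items():
        row_marg[a] += n
        col_marg[b] += n

    # I = sum_ab p(a,b) * log( p(a,b) / (p(a) p(b)) ), skipping empty cells.
    mi = 0.0
    for (a, b), n in joint.items():
        if n > 0:
            mi += (n / total) * log(n * total / (row_marg[a] * col_marg[b]))
    return mi


# Toy data: two row items and two column items that co-occur one-to-one.
toy_counts = {("a", "x"): 1, ("b", "y"): 1}
separate = coclustering_mutual_information(
    toy_counts, {"a": 0, "b": 1}, {"x": 0, "y": 1}
)
merged = coclustering_mutual_information(
    toy_counts, {"a": 0, "b": 0}, {"x": 0, "y": 0}
)
```

In this toy case, keeping the items in separate clusters preserves all of the dependence between rows and columns, while merging everything into one cluster per side leaves none, so `separate` exceeds `merged`.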
Keywords :
combinatorial mathematics; natural languages; pattern clustering; simulated annealing; statistical analysis; co-clustering problem; combinatorial optimization problem; mutual information; simmered greedy optimization; statistical natural language problems; Bioinformatics; Clustering algorithms; Computational modeling; Information technology; Natural language processing; Scheduling algorithm; Stochastic processes; Testing; benchmarking; combinatorial optimization; meta-heuristic; unsupervised learning
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Information Technology: New Generations (ITNG), 2010 Seventh International Conference on
Conference_Location :
Las Vegas, NV
Print_ISBN :
978-1-4244-6270-4
Type :
conf
DOI :
10.1109/ITNG.2010.110
Filename :
5501693