DocumentCode
3350360
Title
Finding and visualizing relevant subspaces for clustering high-dimensional astronomical data using connected morphological operators
Author
Ferdosi, Bilkis J. ; Buddelmeijer, Hugo ; Trager, Scott ; Wilkinson, Michael H F ; Roerdink, Jos B T M
Author_Institution
Johann Bernoulli Inst. for Math. & Comput. Sci., Univ. of Groningen, Groningen, Netherlands
fYear
2010
fDate
25-26 Oct. 2010
Firstpage
35
Lastpage
42
Abstract
Data sets in astronomy are growing to enormous sizes. Modern astronomical surveys provide not only image data but also catalogues of millions of objects (stars, galaxies), each object with hundreds of associated parameters. Exploration of this very high-dimensional data space poses a huge challenge. Subspace clustering is one among several approaches which have been proposed for this purpose in recent years. However, many clustering algorithms require the user to set a large number of parameters without any guidelines. Some methods also do not provide a concise summary of the datasets, or, if they do, they lack additional important information such as the number of clusters present or the significance of the clusters. In this paper, we propose a method for ranking subspaces for clustering which overcomes many of the above limitations. First we carry out a transformation from parametric space to discrete image space, where the data are represented by a grid-based density field. Then we apply so-called connected morphological operators to this density field of astronomical objects, which provides visual support for the analysis of the important subspaces. Clusters in subspaces correspond to high-intensity regions in the density image. The importance of a cluster is measured by a new quality criterion based on the dynamics of local maxima of the density. Connected operators are able to extract such regions with an indication of the number of clusters present. The subspaces are visualized during computation of the quality measure, so that the user can interact with the system to improve the results. In the result stage, we use three visualization toolkits linked within a graphical user interface so that the user can perform an in-depth exploration of the ranked subspaces. Evaluation based on synthetic as well as real astronomical datasets demonstrates the power of the new method. We recover various known astronomical relations directly from the data with few or no a priori assumptions. Hence, our method holds good prospects for discovering new relations as well.
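The two core steps named in the abstract (gridding a subspace into a density field, then measuring the dynamics of its local maxima) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `density_grid` and `maxima_dynamics`, the bin count, the 4-connectivity, and the use of a plain 2D histogram as the density estimate are all assumptions made for illustration, and the paper's actual quality criterion is not specified in this record. Here the dynamics of a maximum is taken as its height minus the highest level to which one must descend to reach a higher maximum; the global maximum is assigned its height above the minimum of the field.

```python
import numpy as np


def density_grid(x, y, bins=32):
    """Bin two parameters into a 2D histogram (an assumed stand-in density field)."""
    hist, _, _ = np.histogram2d(x, y, bins=bins)
    return hist


def maxima_dynamics(image):
    """Return {(row, col): dynamics} for local maxima with dynamics > 0.

    Pixels are flooded from highest to lowest with a union-find structure.
    When two components meet, the one with the lower peak ends, and its
    dynamics is its peak height minus the level of the meeting pixel.
    """
    rows, cols = image.shape
    flat = image.ravel()
    order = np.argsort(-flat, kind="stable")        # highest pixels first
    parent = np.full(flat.size, -1, dtype=np.int64)  # -1 means not yet flooded
    peak = {}       # component root -> flat index of its highest pixel
    dynamics = {}

    def find(i):
        # Path-halving find.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return int(i)

    for idx in order:
        idx = int(idx)
        parent[idx] = idx
        peak[idx] = idx
        r, c = divmod(idx, cols)
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):  # 4-connectivity
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            n = nr * cols + nc
            if parent[n] == -1:
                continue
            a, b = find(idx), find(n)
            if a == b:
                continue
            # Keep the component with the higher peak as the survivor `a`.
            if flat[peak[a]] < flat[peak[b]]:
                a, b = b, a
            d = flat[peak[b]] - flat[idx]
            if d > 0:
                dynamics[divmod(peak[b], cols)] = float(d)
            parent[b] = a

    # The global maximum never merges into a higher one.
    top = peak[find(int(order[0]))]
    dynamics[divmod(top, cols)] = float(flat[top] - flat.min())
    return dynamics
```

Under these assumptions, a subspace whose density field has a few maxima with large dynamics would be a candidate for containing well-separated clusters, while many maxima with small dynamics would suggest noise; how such values are combined into a ranking is left open here.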
Keywords
data visualisation; graphical user interfaces; pattern clustering; spatial data structures; visual databases; catalogues; connected morphological operators; data clustering; data representation; data sets; data visualization; discrete image space; graphical user interface; grid-based density field; high-dimensional astronomical data; image data; local maxima; modern astronomical surveys; parametric space; subspaces; visualization toolkits; Data visualization; Estimation; Noise; Principal component analysis; Shape; Smoothing methods; Visualization; Subspace finding; astronomical data; clustering high-dimensional data; connected morphological operators; visual exploration;
fLanguage
English
Publisher
IEEE
Conference_Titel
Visual Analytics Science and Technology (VAST), 2010 IEEE Symposium on
Conference_Location
Salt Lake City, UT
Print_ISBN
978-1-4244-9488-0
Electronic_ISBN
978-1-4244-9487-3
Type
conf
DOI
10.1109/VAST.2010.5652450
Filename
5652450