• DocumentCode
    3722384
  • Title

    Clustering of Complex Data-Sets Using Fractal Similarity Measures and Uncertainties

  • Author

    Maximilian Hoecker;Kai Lars Polsterer;Kügler;Vincent Heuveline

  • Author_Institution
    Heidelberg Univ., Heidelberg, Germany
  • fYear
    2015
  • Firstpage
    82
  • Lastpage
    91
  • Abstract
    The unsupervised analysis of data-sets that are large both in dimension and in number of objects is one of the most challenging tasks in data-intensive sciences. Especially in astronomy, dedicated survey telescopes generate an enormous amount of complex data. For example, the database of the Sloan Digital Sky Survey (SDSS DR10) contains 3 million spectra with ca. 5,000 values each. Analyzing those spectra is computationally demanding when applying standard techniques and standard similarity measures. In addition to the big data aspects, one has to deal with the uncertainties of the measurements. We present a generic and noise-tolerant similarity measure which is based on box counting methods and comparable to calculating fractal dimensions. Besides the theoretical aspects of the proposed method, the implementation details as well as the achieved evaluation results are discussed in this paper. Our implementation exploits current affordable computing architectures with large memory resources. The Fractal Similarity Measure enables scientists to perform clustering, classification and outlier detection in today's databases. Even though this is a generic method, the experiments shown in this paper demonstrate the performance just for clustering.
  • Keywords
    "Fractals","Extraterrestrial measurements","Clustering algorithms","Mathematical model","Shape","Uncertainty","Astronomy"
  • Publisher
    ieee
  • Conference_Titel
    Computational Science and Engineering (CSE), 2015 IEEE 18th International Conference on
  • Type

    conf

  • DOI
    10.1109/CSE.2015.35
  • Filename
    7371359