• DocumentCode
    3739243
  • Title
    New Quality Indexes for Optimal Clustering Model Identification with High Dimensional Data
  • Author
    Jean-Charles Lamirel;Pascal Cuxac
  • Author_Institution
    Synalp Team, LORIA, Vandoeuvre les Nancy, France
  • fYear
    2015
  • Firstpage
    855
  • Lastpage
    862
  • Abstract
    Feature maximization is an alternative to the usual distributional measures, which rely on entropy or on the Chi-square metric, and to vector-based measures such as Euclidean distance or correlation distance. One of its key advantages is that it operates in an incremental mode for both clustering and traditional classification. In the classification framework, it does not suffer from the limitations of the aforementioned measures when processing highly unbalanced, heterogeneous and highly multidimensional data. We present a new application of this measure in the clustering context: the creation of new cluster quality indexes that can be applied efficiently to data ranging from low to high dimensionality and that are tolerant to noise. We compare the behavior of these new indexes with that of usual cluster quality indexes based on Euclidean distance on different kinds of test datasets for which ground truth is available. This comparison clearly highlights the superior accuracy and stability of the new method.
  • Keywords
    "Indexes","Context","Clustering methods","Footwear","Euclidean distance","Hair","Entropy"
  • Publisher
    ieee
  • Conference_Titel
    Data Mining Workshop (ICDMW), 2015 IEEE International Conference on
  • Electronic_ISBN
    2375-9259
  • Type
    conf
  • DOI
    10.1109/ICDMW.2015.220
  • Filename
    7395757