• DocumentCode
    1380337
  • Title
    Clustering without a metric
  • Author
    Matthews, Geoffrey ; Hearne, James
  • Author_Institution
    Dept. of Comput. Sci., Western Washington Univ., Bellingham, WA, USA
  • Volume
    13
  • Issue
    2
  • fYear
    1991
  • fDate
    2/1/1991
  • Firstpage
    175
  • Lastpage
    184
  • Abstract
    A methodology for clustering data in which a distance metric or similarity function is not used is described. Instead, clusterings are optimized based on their intended function: the accurate prediction of properties of the data. The resulting clustering methodology is applicable, without further ad hoc assumptions or transformations of the data, (1) when features are heterogeneous (both discrete and continuous) and not combinable, (2) where some data points have missing feature values, and (3) where some features are irrelevant, i.e., have large variance but little correlation with other features. Further, it provides an integral measure of the quality of the resulting clustering. A clustering program, RIFFLE, has been implemented in line with this approach, and experiments with synthetic and real data show that the clustering is, in many respects, superior to traditional methods.
  • Keywords
    optimisation; pattern recognition; statistical analysis; RIFFLE; clustering; distance metric; similarity function; Cause effect analysis; Cities and towns; Computer errors; Computer science; Data analysis; Extraterrestrial measurements; Helium; Pattern analysis; Predictive models; Unsupervised learning
  • fLanguage
    English
  • Journal_Title
    Pattern Analysis and Machine Intelligence, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0162-8828
  • Type
    jour
  • DOI
    10.1109/34.67646
  • Filename
    67646