Title of article :
Means and variances for a family of similarity indices used in cluster analysis
Author/Authors :
Albatineh, Ahmed N.
Issue Information :
Journal issue with serial number, year 2010
Pages :
11
From page :
2828
To page :
2838
Abstract :
Albatineh et al. (2006) introduced a family L of similarity indices. Members of this family are linear functions of the matching counts matrix [m_ij], where m_ij is the number of elements common to the ith cluster of one clustering and the jth cluster of another clustering of the same data set. Fowlkes and Mallows (1983) derived the mean and variance of the Rand (1971) index and of an index they called B_k (which is actually attributed to Ochiai, 1957), assuming fixed marginal totals of the matching counts matrix and independence of the clustering algorithms. This paper generalizes the Fowlkes and Mallows (1983) derivation of the mean and variance to any member of the L family, which makes it much easier to compare a wide family of indices. Monte Carlo simulations are used to compare the shapes, means and variances of nine members of the L family on null case data (data without clustering structure). Structured case simulations are used to evaluate the nine indices as tools for measuring cluster structure recovery. Data were generated from bivariate normal distributions.
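As an illustration of the quantities named in the abstract, the following minimal sketch builds the matching counts for two label vectors and computes the Rand index and the Fowlkes and Mallows index from them. It is not taken from the paper: the function names, the label-vector input format, and the choice to return 0.0 when the Fowlkes and Mallows denominator is zero are all assumptions made for this example.

```python
from collections import Counter
from math import comb, sqrt


def matching_counts(labels_a, labels_b):
    """Matching counts m_ij, stored sparsely as {(i, j): count}.

    labels_a[k] and labels_b[k] are the cluster labels that two
    clusterings assign to the same object k (assumed input format).
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same objects")
    return Counter(zip(labels_a, labels_b))


def pair_counts(labels_a, labels_b):
    """Pair counts derived from the matching counts and their marginals."""
    m = matching_counts(labels_a, labels_b)
    n = len(labels_a)
    # Pairs placed together by both clusterings; equals (sum m_ij^2 - n) / 2.
    both = sum(comb(v, 2) for v in m.values())
    # Pairs placed together by each clustering separately (marginal totals).
    in_a = sum(comb(v, 2) for v in Counter(labels_a).values())
    in_b = sum(comb(v, 2) for v in Counter(labels_b).values())
    return both, in_a, in_b, comb(n, 2)


def rand_index(labels_a, labels_b):
    """Fraction of object pairs on which the two clusterings agree."""
    both, in_a, in_b, total = pair_counts(labels_a, labels_b)
    # Agreements = pairs together in both + pairs apart in both.
    return (total + 2 * both - in_a - in_b) / total


def fowlkes_mallows(labels_a, labels_b):
    """Pairs together in both, over the geometric mean of the marginal pair counts."""
    both, in_a, in_b, _ = pair_counts(labels_a, labels_b)
    denom = sqrt(in_a * in_b)
    # Assumption for this sketch: report 0.0 when either clustering has no
    # co-clustered pairs, since the ratio is undefined there.
    return both / denom if denom > 0 else 0.0


a = [0, 0, 1, 1]
b = [0, 0, 1, 2]
print(rand_index(a, b), fowlkes_mallows(a, b))
```

Both indices depend on the data only through the matching counts and their row and column totals, which is why fixing the marginal totals is a natural condition for deriving means and variances.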
Keywords :
Similarity index , clustering algorithm , Rand index , Matching counts , Cluster analysis
Journal title :
Journal of Statistical Planning and Inference
Serial Year :
2010
Record number :
2220902