DocumentCode
2448932
Title
Uniformity testing using minimal spanning tree
Author
Jain, Anil K. ; Xu, Xiaowei ; Ho, Tin Kam ; Xiao, Fan
Author_Institution
Dept. of Comput. Sci. & Eng., Michigan State Univ., East Lansing, MI, USA
Volume
4
fYear
2002
fDate
2002
Firstpage
281
Abstract
Testing for uniformity of multivariate data is the initial step in exploratory pattern analysis. We propose a new uniformity testing method, which first computes the maximum (standardized) edge length in the minimal spanning tree (MST) of the given data. A large maximum edge length indicates the existence of well-separated clusters or outliers in the data. For data that pass this edge inconsistency test, we generate two sub-samples by a weighted re-sampling method, where the weights are computed from the normalized edge lengths of the MST of the entire data set. The uniformity of the data is then estimated by running the two-sample MST-test on these two sub-samples. Experiments with simulated and real data show the potential of the proposed test in identifying uniform or weakly clustered data. The test can also be used to rank data sets by their degree of uniformity.
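The two MST quantities named in the abstract can be illustrated with a minimal sketch in pure Python. This is not the authors' implementation. It rests on two assumptions the abstract does not confirm: that "standardized" means a z-score over the MST edge lengths, and that the two-sample MST-test is the Friedman-Rafsky style statistic, which counts pooled-MST edges joining points from different samples. The function names are hypothetical. The weighted re-sampling step is omitted because the abstract does not give the weight formula.

```python
import math
import statistics


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns a list of (i, j, length) tuples, one per MST edge.
    """
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # shortest known distance from the tree to each point
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        # Relax distances of the remaining points against the new tree point.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def max_standardized_edge(points):
    """Largest MST edge length as a z-score (ASSUMED form of standardization)."""
    lengths = [w for _, _, w in mst_edges(points)]
    mean = statistics.mean(lengths)
    sd = statistics.pstdev(lengths)
    return (max(lengths) - mean) / sd if sd > 0 else 0.0


def cross_sample_edges(sample_a, sample_b):
    """Count pooled-MST edges that join a point of sample_a to a point of sample_b.

    ASSUMPTION: this is the Friedman-Rafsky style count; few cross edges
    suggest the two samples occupy different regions.
    """
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    return sum((i < n_a) != (j < n_a) for i, j, _ in mst_edges(pooled))
```

For example, `max_standardized_edge([(0, 0), (1, 0), (2, 0), (3, 0), (10, 0)])` is driven by the single long edge to the outlying point, while `cross_sample_edges` returns a small count for well-separated samples and a larger one for interleaved samples. Turning either value into a decision needs a threshold or null distribution, which the abstract does not specify.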
Keywords
pattern clustering; statistical analysis; trees (mathematics); edge inconsistency test; exploratory pattern analysis; maximum edge length; minimal spanning tree; multivariate data; outliers; uniformity testing; weighted re-sampling method; well-separated clusters; Computational modeling; Computer science; Data engineering; Multidimensional systems; Pattern analysis; Pattern recognition; Sampling methods; Shape; Testing; Tin;
fLanguage
English
Publisher
IEEE
Conference_Titel
Pattern Recognition, 2002. Proceedings. 16th International Conference on
ISSN
1051-4651
Print_ISBN
0-7695-1695-X
Type
conf
DOI
10.1109/ICPR.2002.1047451
Filename
1047451