Title :
A Method for Implementing a Statistically Significant Number of Data Classes in the Jenks Algorithm
Author :
North, Matthew A.
Author_Institution :
Washington & Jefferson Coll., WA, USA
Abstract :
The Jenks natural breaks algorithm is a standard method for dividing a dataset into a specified number of homogeneous classes. The algorithm is commonly used in geographic information systems (GIS) applications. One major drawback to the use of Jenks in this context is that the desired number of classes must be specified before the algorithm is applied to the dataset. Without a mechanism for determining the appropriate number of classes for a given dataset, the results of Jenks classification may be inaccurate, or worse, arbitrary. This paper proposes a method for determining, through iterative tests of statistical significance, the appropriate number of classes for a dataset with any number of observations. Pseudo-code for the method is provided.
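The abstract does not say which significance test the paper applies, so the following is only a rough sketch of the general idea, not the author's pseudo-code. It assumes a dynamic-programming form of Jenks classification (minimizing within-class squared deviation) and, as a stand-in test, a two-sided Welch-style comparison of adjacent class means using a normal approximation. The function names `jenks_classes`, `adjacent_classes_differ`, and `choose_class_count`, and the parameters `max_k` and `alpha`, are hypothetical.

```python
from statistics import NormalDist, mean, variance


def jenks_classes(values, k):
    """Split values into k contiguous classes minimizing within-class squared deviation."""
    data = sorted(values)
    n = len(data)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of observations")
    # Prefix sums let us compute the squared deviation of any slice in O(1).
    s1, s2 = [0.0], [0.0]
    for x in data:
        s1.append(s1[-1] + x)
        s2.append(s2[-1] + x * x)

    def ssd(i, j):
        """Sum of squared deviations from the mean for data[i:j]."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                candidate = cost[c - 1][i] + ssd(i, j)
                if candidate < cost[c][j]:
                    cost[c][j] = candidate
                    cut[c][j] = i
    # Walk the stored cut points backwards to recover the classes.
    classes, j = [], n
    for c in range(k, 0, -1):
        i = cut[c][j]
        classes.append(data[i:j])
        j = i
    classes.reverse()
    return classes


def adjacent_classes_differ(a, b, alpha=0.05):
    """Assumed test: Welch-style z-test on class means (normal approximation)."""
    if len(a) < 2 or len(b) < 2:
        return False  # too few observations to estimate a variance
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    if se == 0:
        return mean(a) != mean(b)
    z = abs(mean(a) - mean(b)) / se
    p_value = 2 * (1 - NormalDist().cdf(z))
    return p_value < alpha


def choose_class_count(values, max_k=10, alpha=0.05):
    """Increase k until some pair of adjacent classes is no longer significantly different."""
    best = 1
    for k in range(2, min(max_k, len(values)) + 1):
        classes = jenks_classes(values, k)
        pairs = zip(classes, classes[1:])
        if all(adjacent_classes_differ(a, b, alpha) for a, b in pairs):
            best = k
        else:
            break
    return best


sample = [1, 2, 3, 20, 21, 22, 50, 51, 52]
print(choose_class_count(sample))  # prints 3
```

Because Jenks classes are non-overlapping slices of the sorted data, a test on class means is liberal here; the stopping rule in this sketch is driven mainly by classes becoming too small to test. A different test or stopping rule would change the selected number of classes.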
Keywords :
data handling; geographic information systems; statistical analysis; GIS; Jenks classification; Jenks natural breaks algorithm; data classes; geographic information system; pseudocode; statistical method; Algorithm design and analysis; Counting circuits; Data analysis; Educational institutions; Fuzzy systems; Geographic Information Systems; Iterative algorithms; Iterative methods; Optimization methods; Testing; Jenks Algorithm; categorical data; classification; statistical significance;
Conference_Title :
Fuzzy Systems and Knowledge Discovery, 2009. FSKD '09. Sixth International Conference on
Conference_Location :
Tianjin
Print_ISBN :
978-0-7695-3735-1
DOI :
10.1109/FSKD.2009.319