Title :
The impact of data normalisation on unsupervised continuous classification of landforms
Author_Institution :
King's Coll., London, UK
Abstract :
Although most clustering techniques that characterise output classes by mean vectors and a covariance matrix require features to be roughly Gaussian, the assessment of features for normality is often ignored in the classification process. This paper presents an example of the effect of data transformation on unsupervised continuous classification of landforms, using the fuzzy k-means with extragrades algorithm on morphometric variables (MVs) derived from a DEM. Results show that, after transformation, all MV distributions become more normal and homogeneous. The optimal number of classes given by minimising the fuzzy performance index (FPI) and the modified partition entropy (MPE) changes significantly between untransformed and normalised data. Normalised data achieved better results, yielding more distinct and less disorganised substructures for the same optimal number of classes. However, class separability was reduced due to data homogenisation. Furthermore, the classes produced from the two data sets show only a small degree of overlap. Thus, data transformation not only affected the optimal solution pair in terms of number of classes and degree of fuzziness, it also had a major impact on the classification of each individual pixel, demonstrating the importance of data normalisation in the unsupervised classification process and in the delineation of landforms.
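The workflow summarised in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses plain fuzzy k-means without the extragrade class, the fuzziness exponent `phi` and all function names are hypothetical, and a log transform stands in for the transformation, which the abstract does not specify. FPI and MPE follow their commonly used definitions (FPI = 1 - (cF - 1)/(c - 1) with partition coefficient F, and MPE = H / log c with partition entropy H).

```python
import numpy as np

def skewness(x):
    """Sample skewness, used here as a crude check of departure from normality."""
    z = (x - x.mean()) / x.std()
    return float((z ** 3).mean())

def fuzzy_kmeans(X, c, phi=1.5, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy k-means (no extragrade class; an assumption of this sketch).

    Returns memberships of shape (n, c) and centroids of shape (c, d).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    M = rng.random((n, c))
    M /= M.sum(axis=1, keepdims=True)          # each row sums to 1
    for _ in range(n_iter):
        W = M ** phi
        centroids = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared Euclidean distance from every point to every centroid
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)              # avoid division by zero
        inv = d2 ** (-1.0 / (phi - 1.0))
        M_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(M_new - M).max() < tol
        M = M_new
        if converged:
            break
    return M, centroids

def fpi(M):
    """Fuzzy performance index: 0 for a crisp partition, 1 for a uniform one."""
    n, c = M.shape
    F = (M ** 2).sum() / n                      # partition coefficient
    return 1.0 - (c * F - 1.0) / (c - 1.0)

def mpe(M):
    """Modified partition entropy: partition entropy scaled by log(c)."""
    n, c = M.shape
    H = -(M * np.log(np.clip(M, 1e-12, None))).sum() / n
    return H / np.log(c)

# Synthetic, strongly skewed stand-in for morphometric variables (not real data)
rng = np.random.default_rng(42)
raw = rng.lognormal(mean=0.0, sigma=1.0, size=(300, 2))
transformed = np.log(raw)                       # assumed transformation
# Standardise so that no single variable dominates the distance
transformed = (transformed - transformed.mean(axis=0)) / transformed.std(axis=0)

for c in range(2, 6):
    M, _ = fuzzy_kmeans(transformed, c)
    print(c, round(fpi(M), 3), round(mpe(M), 3))
```

In this sketch, the candidate number of classes would be the value of `c` at which both indices are smallest. Repeating the loop on `raw` instead of `transformed` gives the kind of comparison between untransformed and normalised data that the abstract describes.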
Keywords :
Gaussian processes; covariance matrices; feature extraction; fuzzy logic; geophysical signal processing; geophysical techniques; pattern classification; pattern clustering; soil; terrain mapping; DEM; FPI; MPE; classification process; clustering techniques; covariance matrix; data homogenisation; data normalisation; data transformation; delineation; disorganised substructures; feature assessment; fuzziness; fuzzy k-means; fuzzy performance index; individual pixel; landforms; modified partition entropy; morphometric variables; optimised solution; unsupervised continuous classification; Clustering methods; Covariance matrix; Data analysis; Educational institutions; Frequency; Fuzzy sets; Geography; Pattern recognition; Performance analysis; Soil properties;
Conference_Title :
Geoscience and Remote Sensing Symposium, 2003. IGARSS '03. Proceedings. 2003 IEEE International
Print_ISBN :
0-7803-7929-2
DOI :
10.1109/IGARSS.2003.1294810