Title :
Similarity between categorisations
Author :
Birkenhead, Ralph
Author_Institution :
Fac. of Comput. Sci. & Eng., De Montfort Univ., Leicester, UK
Abstract :
The problem of assessing the similarity of two groupings of data into categories is considered. Such groupings may arise from subjective categorisation by different people, or from repeated machine classification by an algorithm that depends on the order of data submission or is even non-deterministic. Several self-organising classification algorithms (e.g. fuzzy ART and fuzzy Min-Max) have this property. In either case, the question of the reliability of the categorisation can only be answered if some notion of distance between different categorisations is available. (This is because the reliability of a method does not depend on always producing exactly the same classification, but merely on producing sufficiently close classifications.) With such a notion, it is possible to conduct experimental work on any data set to ensure that there is no important variation in the categorisation depending on the order in which observations are submitted to the classifier. Several different methods of measuring distance are proposed and investigated. Importantly, two of these measures are formulated in the appropriate algebraic setting for comparing partitions of sets: the lattice of equivalence relations on a finite set. Whilst this involves some computational difficulties, it is hoped that the benefits of an appropriate measure outweigh them. A comparison of two of these distance measures, using a known data set from the literature, is performed and the results are reported.
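The abstract does not state which distance measures are proposed, so the following is only an illustrative sketch of one common way to compare two categorisations as equivalence relations. It treats each categorisation as the set of item pairs it places in the same category, and counts the pairs on which the two categorisations disagree. The function names `co_clustered_pairs` and `pair_distance` are hypothetical, and this pair-counting distance is an assumption made for illustration, not necessarily one of the measures in the paper.

```python
from itertools import combinations


def co_clustered_pairs(labels):
    """Return the index pairs (i, j), i < j, that share a category label.

    This is the equivalence relation induced by the categorisation,
    with the reflexive and symmetric duplicates left out.
    """
    return {
        (i, j)
        for i, j in combinations(range(len(labels)), 2)
        if labels[i] == labels[j]
    }


def pair_distance(labels_a, labels_b):
    """Count the item pairs grouped together by exactly one categorisation.

    Hypothetical illustration: the size of the symmetric difference of the
    two equivalence relations. It is zero when the partitions coincide,
    regardless of how the categories are named.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both categorisations must cover the same items")
    return len(co_clustered_pairs(labels_a) ^ co_clustered_pairs(labels_b))


# Two categorisations of the same five items, e.g. from two runs of an
# order-dependent classifier. Label names carry no meaning across runs.
run_1 = ["x", "x", "y", "y", "z"]
run_2 = [1, 1, 1, 2, 3]
print(pair_distance(run_1, run_2))
```

Because only co-membership of pairs is compared, renaming the categories in one run leaves the distance unchanged, which is the behaviour needed when comparing outputs of self-organising classifiers. Enumerating all pairs costs quadratic time in the number of items, so this sketch suits small data sets only.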
Keywords :
ART neural nets; fuzzy neural nets; pattern classification; set theory; categorisations; equivalence relations; fuzzy ART; fuzzy Min-Max; groupings; machine classification; self organising classification algorithms; set partitioning; Biomedical equipment; Classification algorithms; Computational intelligence; Data engineering; Lattices; Medical services; Medical tests; Performance analysis; Performance evaluation; Subspace constraints;
Conference_Title :
Systems, Man, and Cybernetics, 1998. 1998 IEEE International Conference on
Conference_Location :
San Diego, CA
Print_ISBN :
0-7803-4778-1
DOI :
10.1109/ICSMC.1998.726715