Title :
A neural network learning relative distances
Author_Institution :
Dept. of Comput. Sci., Marburg Univ., Germany
Abstract :
Data mining and knowledge discovery aim to detect new knowledge in data sets produced by some data-generating process. At least two important problems are associated with such data sets: missing values and unknown distributions. Both are particularly troublesome for clustering algorithms, whether statistical or neural, because these algorithms need a distance metric that can compare any pair of high-dimensional data vectors. In today's data mining systems, finding an appropriate metric requires much manual work. This work defines a novel neural network called the ud-net. Ud-nets are able to adapt to unknown distributions of data, and the output of the networks may be interpreted as a distance metric. This metric is also defined for data with missing values in some components, and the resulting value is comparable to that obtained for complete data. Experiments with known distributions that were distorted by nonlinear transformations show that ud-nets produce values on the transformed data that are closely comparable to distance measures on the untransformed distributions. Ud-nets are also tested on a data set from a stock-picking application. For this example, the results of the networks are very similar to those obtained by applying hand-tuned nonlinear transformations to the data set.
Keywords :
data mining; learning (artificial intelligence); neural nets; data sets; high dimensional vectors; knowledge discovery; neural network learning relative distances; ud-net; Clustering algorithms; Computer science; Data mining; Databases; Distortion measurement; Distributed computing; Multidimensional systems; Neural networks; Nonlinear distortion; Testing;
Conference_Title :
Neural Networks, 2000. IJCNN 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on
Conference_Location :
Como
Print_ISBN :
0-7695-0619-4
DOI :
10.1109/IJCNN.2000.861527