DocumentCode :
3372956
Title :
Learning metrics for exploratory data analysis
Author :
Kaski, Samuel
Author_Institution :
Neural Networks Research Centre, Helsinki University of Technology, Finland
fYear :
2001
fDate :
2001
Firstpage :
53
Lastpage :
62
Abstract :
Visualization and cluster analysis of multivariate data are usually based on distances between samples in a data space. The distance measure is often chosen heuristically, for instance by selecting suitable features and then using a global Euclidean metric. We have developed methods that remove this arbitrariness by measuring distances only along important (local) directions. The metric is learned from auxiliary data that is paired with the primary data during the learning process. Changes in the primary data are assumed to be important or relevant if they cause changes in the auxiliary data; for example, in the analysis of gene expression, the auxiliary data can indicate the functional classes of the genes. The new distance measures can be used, for instance, in clustering and in Self-Organizing Map-based data visualization. The methods have so far been applied to the analysis of bankruptcy, text documents, and gene expression.
Keywords :
data analysis; self-organising feature maps; software metrics; unsupervised learning; Self-Organizing Map-based data visualization; auxiliary data; cluster analysis; distance measures; exploratory data analysis; gene expression; Chromium
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Neural Networks for Signal Processing XI, 2001. Proceedings of the 2001 IEEE Signal Processing Society Workshop
Conference_Location :
North Falmouth, MA
ISSN :
1089-3555
Print_ISBN :
0-7803-7196-8
Type :
conf
DOI :
10.1109/NNSP.2001.943110
Filename :
943110