Title :
Visualization and knowledge discovery for high dimensional data
Author :
Inselberg, Alfred
Author_Institution :
Sch. of Math. Sci., Tel Aviv Univ., Israel
Abstract :
The goal of the article is to present a multidimensional visualization methodology and its applications to visual and automatic knowledge discovery. Visualization provides insight through images and can be considered a collection of application-specific mappings: ProblemDomain→VisualRange. For the visualization of multivariate problems, a multidimensional system of parallel coordinates (||-coords) is constructed, which induces a one-to-one mapping between subsets of N-space and subsets of 2-space. The result is a rigorous methodology for doing and seeing N-dimensional geometry. We start with an overview of the mathematical foundations, where it is seen that, from the display of high-dimensional datasets, the search for multivariate relations among the variables is transformed into a 2D pattern recognition problem. This is the basis for the application to visual knowledge discovery, which is illustrated in the second part with a real dataset of VLSI production. Then a recent geometric classifier is presented and applied to 3 real datasets. Compared with the results of 23 other classifiers, its results have the least error. The algorithm has quadratic computational complexity in the size and number of parameters, provides comprehensible and explicit rules, performs dimensionality selection, and orders the selected variables so as to optimize the clarity of separation between the designated set and its complement. Finally, a simple visual economic model of a real country is constructed and analyzed in order to illustrate the special strength of ||-coords in modeling multivariate relations by means of hypersurfaces.
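As an illustration of the mapping between N-space and 2-space mentioned in the abstract, the following is a minimal sketch, not the author's implementation. It assumes equally spaced vertical axes (spacing d = 1 by default) and relies on the standard point-line duality of parallel coordinates: the points of a line x2 = m*x1 + b (m ≠ 1) become segments that all pass through the single point (d/(1-m), b/(1-m)). The function names `to_polyline`, `dual_point_of_line`, and `segment_y_at` are hypothetical and chosen only for this example.

```python
from typing import List, Sequence, Tuple

Point2D = Tuple[float, float]


def to_polyline(point: Sequence[float], spacing: float = 1.0) -> List[Point2D]:
    """Map an N-dimensional point to the vertices of its polyline.

    Axis i is the vertical line x = i * spacing; the value of the i-th
    coordinate is plotted on that axis.
    """
    return [(i * spacing, float(v)) for i, v in enumerate(point)]


def dual_point_of_line(m: float, b: float, spacing: float = 1.0) -> Point2D:
    """Return the 2D point that represents the line x2 = m*x1 + b.

    Assumes m != 1; for m == 1 the segments are parallel and the dual
    point lies at infinity, which this sketch does not model.
    """
    if m == 1:
        raise ValueError("slope 1 maps to an ideal point (not handled here)")
    return (spacing / (1.0 - m), b / (1.0 - m))


def segment_y_at(p0: Point2D, p1: Point2D, x: float) -> float:
    """Evaluate the straight line through p0 and p1 at abscissa x."""
    (x0, y0), (x1, y1) = p0, p1
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


if __name__ == "__main__":
    m, b = -1.0, 4.0
    dual_x, dual_y = dual_point_of_line(m, b)
    for a in (0.0, 1.0, 3.0):
        # Each point (a, m*a + b) on the line becomes one segment.
        p0, p1 = to_polyline((a, m * a + b))
        print(a, segment_y_at(p0, p1, dual_x), dual_y)
```

Every segment evaluates to the same height at the dual abscissa, which is the sense in which a linear relation between two variables shows up as a recognizable 2D pattern.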
Keywords :
computational complexity; data mining; data visualisation; pattern classification; set theory; ∥-coords; 2D pattern recognition problem; N-dimensional geometry; VLSI production; application specific mappings; automatic knowledge discovery; dimensionality selection; geometric classifier; high dimensional data; high-dimensional datasets; hypersurfaces; mathematical foundations; multidimensional system; multidimensional visualization methodology; multivariate problems; multivariate relations; one-to-one mapping; parallel coordinates; quadratic computational complexity; real country; real dataset; rigorous methodology; simple visual economic model; visual knowledge discovery; Algorithm design and analysis; Computational complexity; Data visualization; Design optimization; Geometry; Multidimensional systems; Pattern recognition; Production; Two dimensional displays; Very large scale integration;
Conference_Title :
User Interfaces to Data Intensive Systems, 2001. UIDIS 2001. Proceedings. Second International Workshop on
Conference_Location :
Zurich
Print_ISBN :
0-7695-0834-0
DOI :
10.1109/UIDIS.2001.929921