DocumentCode :
399380
Title :
Mapping nominal values to numbers for effective visualization
Author :
Rosario, Geraldine E. ; Rundensteiner, Elke A. ; Brown, David C. ; Ward, Matthew O.
Author_Institution :
Dept. of Comput. Sci., Worcester Polytech. Inst., USA
fYear :
2003
fDate :
21 Oct. 2003
Firstpage :
113
Lastpage :
120
Abstract :
Data sets with a large number of nominal variables, some with high cardinality, are becoming increasingly common and need to be explored. Unfortunately, most existing visual exploration displays are designed to handle numeric variables only. When importing data sets with nominal values into such visualization tools, most solutions to date are rather simplistic. Often, techniques that map nominal values to numbers do not assign order or spacing among the values in a manner that conveys semantic relationships. Moreover, displays designed for nominal variables usually cannot handle high cardinality variables well. This paper addresses the problem of how to display nominal variables in general-purpose visual exploration tools designed for numeric variables. Specifically, we investigate (1) how to assign order and spacing among the nominal values, and (2) how to reduce the number of distinct values to display. We propose that nominal variables be pre-processed using a distance-quantification-classing (DQC) approach before being imported into a visual exploration tool. In the distance step, we identify a set of independent dimensions that can be used to calculate the distance between nominal values. In the quantification step, we use the independent dimensions and the distance information to assign order and spacing among the nominal values. In the classing step, we use results from the previous steps to determine which values within a variable are similar to each other and thus can be grouped together. Each step in the DQC approach can be accomplished by a variety of techniques. We extended the XmdvTool package to incorporate this approach. We evaluated our approach on several data sets using a variety of evaluation measures.
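The three DQC steps described in the abstract can be illustrated with a minimal sketch. The abstract does not say which technique is used for each step, so everything here is an assumption: simple correspondence analysis (listed in the keywords) covers the distance and quantification steps, and a gap threshold on the resulting scores stands in for the classing step. The function names, the `max_gap` parameter, and the toy contingency table are all hypothetical, and this is not the XmdvTool implementation.

```python
import numpy as np

def correspondence_row_coords(counts):
    """Principal row coordinates from simple correspondence analysis.

    counts: contingency table with the nominal variable's values as rows
    and the categories of another variable as columns. Every row and
    column is assumed to have a nonzero total.
    """
    counts = np.asarray(counts, dtype=float)
    P = counts / counts.sum()            # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    expected = np.outer(r, c)
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, sigma, _ = np.linalg.svd(S, full_matrices=False)
    # Distance step: Euclidean distances between these rows are
    # chi-square distances between the row profiles.
    return (U * sigma) / np.sqrt(r)[:, None]

def quantify(counts):
    """Quantification step (assumed): score each value on the first axis."""
    return correspondence_row_coords(counts)[:, 0]

def class_values(scores, max_gap):
    """Classing step (assumed): walk the values in score order and start
    a new class whenever the gap to the previous score exceeds max_gap."""
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)
    labels = np.empty(len(scores), dtype=int)
    current = 0
    labels[order[0]] = current
    for prev, nxt in zip(order[:-1], order[1:]):
        if scores[nxt] - scores[prev] > max_gap:
            current += 1
        labels[nxt] = current
    return labels

# Hypothetical table: 4 nominal values (rows) by 3 categories (columns).
# Rows 0 and 1 have proportional counts, i.e. identical profiles.
table = [[20, 5, 1],
         [40, 10, 2],
         [2, 6, 30],
         [1, 5, 25]]

scores = quantify(table)
labels = class_values(scores, max_gap=0.5)
print(scores)
print(labels)
```

The sign of a correspondence-analysis axis is arbitrary, so only the relative order and spacing of the scores carry meaning. Values with identical profiles receive identical scores, which is what lets the classing step group them.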
Keywords :
data compression; data visualisation; mathematics computing; software packages; DQC approach; XmdvTool package; classing step; clustering; correspondence analysis; data sets; data visualization; dimension reduction; distance step; distance-quantification-classing; nominal data; nominal value mapping; nominal variables; nominal values; numeric variables; quantification step; visual exploration displays; visualization tools; Chromium; Computer displays; Computer science; Data visualization; Information analysis; Mathematics; Packaging; Pattern analysis; Pattern recognition; Probability;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Visualization, 2003. INFOVIS 2003. IEEE Symposium on
Conference_Location :
Seattle, WA, USA
Print_ISBN :
0-7803-8154-8
Type :
conf
DOI :
10.1109/INFVIS.2003.1249016
Filename :
1249016