DocumentCode
586479
Title
Dis-function: Learning distance functions interactively
Author
Brown, Eli T. ; Liu, Jingjing ; Brodley, Carla E. ; Chang, Remco
Author_Institution
Dept. of Comput. Sci., Tufts Univ., Medford, MA, USA
fYear
2012
fDate
14-19 Oct. 2012
Firstpage
83
Lastpage
92
Abstract
The world's corpora of data grow in size and complexity every day, making it increasingly difficult for experts to make sense out of their data. Although machine learning offers algorithms for finding patterns in data automatically, they often require algorithm-specific parameters, such as an appropriate distance function, which are outside the purview of a domain expert. We present a system that allows an expert to interact directly with a visual representation of the data to define an appropriate distance function, thus avoiding direct manipulation of obtuse model parameters. Adopting an iterative approach, our system first assumes a uniformly weighted Euclidean distance function and projects the data into a two-dimensional scatterplot view. The user can then move incorrectly positioned data points to locations that reflect his or her understanding of the similarity of those data points relative to the other data points. Based on this input, the system performs an optimization to learn a new distance function and then re-projects the data to redraw the scatterplot. We illustrate empirically that with only a few iterations of interaction and optimization, a user can achieve a scatterplot view and its corresponding distance function that reflect the user's knowledge of the data. In addition, we evaluate our system to assess scalability in data size and data dimension, and show that our system is computationally efficient and can provide an interactive or near-interactive user experience.
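The loop described in the abstract (start from uniform feature weights, collect user feedback, re-fit the weights) can be illustrated with a minimal Python sketch. This is not the paper's optimization: the function names `weighted_distances` and `learn_weights`, the `(i, j, target_distance)` constraint format, the squared-distance least-squares objective, and the clip-and-renormalize step are all assumptions made for illustration, and the two-dimensional re-projection step is omitted.

```python
import numpy as np

def weighted_distances(X, w):
    """Pairwise weighted Euclidean distances:
    d_ij = sqrt(sum_k w_k * (x_ik - x_jk)^2)."""
    diff = X[:, None, :] - X[None, :, :]              # shape (n, n, d)
    return np.sqrt(np.einsum("ijk,k->ij", diff ** 2, w))

def learn_weights(X, constraints, steps=500, lr=0.05):
    """Hypothetical weight update from user feedback.

    constraints: list of (i, j, target_distance) tuples saying how far
    apart the user wants points i and j to be. Gradient descent on a
    least-squares fit of squared weighted distances to squared targets,
    keeping weights non-negative and summing to 1 (a heuristic
    clip-and-renormalize step, not an exact projection).
    """
    d = X.shape[1]
    w = np.full(d, 1.0 / d)                           # uniform starting weights
    sq = np.array([(X[i] - X[j]) ** 2 for i, j, _ in constraints])  # (m, d)
    t2 = np.array([t ** 2 for _, _, t in constraints])              # (m,)
    for _ in range(steps):
        resid = sq @ w - t2                           # error in squared distance
        grad = 2.0 * sq.T @ resid / len(constraints)
        w = np.clip(w - lr * grad, 0.0, None)         # keep weights non-negative
        total = w.sum()
        if total > 0:
            w = w / total                             # renormalize to sum to 1
    return w
```

For example, given three points where the user indicates that points 0 and 1 belong together and points 0 and 2 belong apart, `learn_weights(X, [(0, 1, 0.0), (0, 2, 1.0)])` shifts weight toward the feature that separates points 0 and 2; the resulting weights can then be passed to `weighted_distances` before re-projecting.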
Keywords
data analysis; data visualisation; iterative methods; learning (artificial intelligence); pattern classification; classification; data visual representation; dis-function; iterative approach; machine learning; optimization; two-dimensional scatterplot view; uniformly weighted Euclidean distance function; Bars; Data visualization; Euclidean distance; Machine learning; Vectors; Yttrium;
fLanguage
English
Publisher
IEEE
Conference_Titel
Visual Analytics Science and Technology (VAST), 2012 IEEE Conference on
Conference_Location
Seattle, WA
Print_ISBN
978-1-4673-4752-5
Type
conf
DOI
10.1109/VAST.2012.6400486
Filename
6400486