DocumentCode :
3437929
Title :
Can Shared Nearest Neighbors Reduce Hubness in High-Dimensional Spaces?
Author :
Flexer, Arthur ; Schnitzer, Dan
Author_Institution :
Austrian Res. Inst. for Artificial Intell., Vienna, Austria
fYear :
2013
fDate :
7-10 Dec. 2013
Firstpage :
460
Lastpage :
467
Abstract :
'Hubness' is a recently discovered general problem of machine learning in high-dimensional data spaces. Hub objects have a small distance to an exceptionally large number of data points, and anti-hubs are far from all other data points. Hubness is related to the concentration of distances, which impairs the contrast of distances in high-dimensional spaces. Computation of secondary distances inspired by shared nearest neighbor (SNN) approaches has been shown to reduce hubness and concentration, and some work already exists on the direct application of SNN in the context of hubness in image recognition. This study applies SNN to a larger number of high-dimensional real-world data sets from diverse domains and compares it to two other secondary distance approaches (local scaling and mutual proximity). SNN is shown to reduce hubness, but less than the other approaches, and, contrary to its competitors, it is only able to improve classification accuracy for half of the data sets.
Keywords :
data handling; learning (artificial intelligence); pattern classification; SNN; data points; high dimensional data spaces; image recognition; machine learning; real world data sets; shared nearest neighbors; Accuracy; Conferences; Context; Electronic mail; Histograms; Image recognition; Standards; curse of dimensionality; hubness; machine learning; shared nearest neighbors;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining Workshops (ICDMW), 2013 IEEE 13th International Conference on
Conference_Location :
Dallas, TX
Print_ISBN :
978-1-4799-3143-9
Type :
conf
DOI :
10.1109/ICDMW.2013.101
Filename :
6753957