DocumentCode
2662161
Title
Distance learning for categorical attribute based on context information
Author
Khorshidpour, Zeinab ; Hashemi, Sattar ; Hamzeh, Ali
Author_Institution
Dept. Electron. & Comput. Eng., Shiraz Univ., Shiraz, Iran
Volume
2
fYear
2010
fDate
3-5 Oct. 2010
Abstract
In this paper, we propose a novel method to measure the dissimilarity of categorical data. Our approach has two steps. In the first step, we select a relevant subset of the full attribute set to serve as the context for a given attribute. In the second step, we compute the dissimilarity between pairs of values of the same attribute using the context defined in the first step. The dissimilarity between two categorical values of an attribute is computed as a combination of the dissimilarities between the conditional probability distributions of the context attributes given these two values. Experiments with real data show that our dissimilarity estimation method improves the accuracy of the popular nearest neighbor classifier.
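The second step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which distance between distributions or which combination rule the authors use, so the total variation distance and the plain average below are assumptions, as are all function names and the toy data. The context attributes are passed in directly; the context selection step is not modelled here.

```python
from collections import Counter


def conditional_distribution(rows, target, context, value):
    """Estimate P(context attribute | target attribute == value) from counts."""
    counts = Counter(row[context] for row in rows if row[target] == value)
    total = sum(counts.values())
    return {v: c / total for v, c in counts.items()} if total else {}


def total_variation(p, q):
    """Total variation distance between two discrete distributions (dicts).

    Assumed choice of distance; the abstract does not name one.
    """
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)


def value_dissimilarity(rows, target, context_attrs, x1, x2):
    """Average, over the context attributes, of the distance between the
    conditional distributions given x1 and given x2.

    Averaging is an assumed combination rule, not necessarily the paper's.
    """
    if not context_attrs:
        # No context available: fall back to simple value mismatch.
        return 0.0 if x1 == x2 else 1.0
    dists = [
        total_variation(
            conditional_distribution(rows, target, c, x1),
            conditional_distribution(rows, target, c, x2),
        )
        for c in context_attrs
    ]
    return sum(dists) / len(dists)


# Toy data invented for illustration: columns are (colour, shape, size).
rows = [
    ("red", "round", "small"),
    ("red", "round", "large"),
    ("pink", "round", "small"),
    ("pink", "round", "large"),
    ("blue", "square", "large"),
    ("blue", "square", "large"),
]

# Dissimilarity between values of attribute 0, using attributes 1 and 2 as context.
d_red_pink = value_dissimilarity(rows, 0, [1, 2], "red", "pink")
d_red_blue = value_dissimilarity(rows, 0, [1, 2], "red", "blue")
```

In this toy data, "red" and "pink" co-occur with the same shapes and sizes, so their dissimilarity is 0, while "red" and "blue" differ in both context attributes and come out at 0.75. A learned value-to-value dissimilarity of this kind could then replace the usual 0/1 mismatch inside a nearest neighbor classifier.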
Keywords
data mining; distance learning; probability; ubiquitous computing; categorical attribute; categorical data dissimilarity measurement; conditional probability distributions; context attribute; context information; nearest neighbor classifier; Categorical data; Distance function learning; Irrelevant feature; Nearest neighbor
fLanguage
English
Publisher
IEEE
Conference_Titel
Software Technology and Engineering (ICSTE), 2010 2nd International Conference on
Conference_Location
San Juan, PR
Print_ISBN
978-1-4244-8667-0
Electronic_ISBN
978-1-4244-8666-3
Type
conf
DOI
10.1109/ICSTE.2010.5608801
Filename
5608801