• DocumentCode
    2662161
  • Title
    Distance learning for categorical attribute based on context information
  • Author
    Khorshidpour, Zeinab ; Hashemi, Sattar ; Hamzeh, Ali
  • Author_Institution
    Dept. Electron. & Comput. Eng., Shiraz Univ., Shiraz, Iran
  • Volume
    2
  • fYear
    2010
  • fDate
    3-5 Oct. 2010
  • Abstract
    In this paper, we propose a novel method to measure the dissimilarity of categorical data. Our approach consists of two steps. In the first step, we select a relevant subset of the full attribute set to serve as the context for a given attribute. In the second step, we compute the dissimilarity between pairs of values of the same attribute using the context defined in the first step. The dissimilarity between two categorical values of an attribute is computed as a combination of the dissimilarities between the conditional probability distributions of the context attributes given these two values. Experiments with real data show that our dissimilarity estimation method improves the accuracy of the popular nearest neighbor classifier.
  • Keywords
    data mining; distance learning; probability; ubiquitous computing; categorical attribute; categorical data dissimilarity measurement; conditional probability distributions; context attribute; context information; distance learning; nearest neighbor classifier; Categorical data; Distance function learning; Irrelevant feature; Nearest neighbor;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Software Technology and Engineering (ICSTE), 2010 2nd International Conference on
  • Conference_Location
    San Juan, PR
  • Print_ISBN
    978-1-4244-8667-0
  • Electronic_ISBN
    978-1-4244-8666-3
  • Type
    conf
  • DOI
    10.1109/ICSTE.2010.5608801
  • Filename
    5608801
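
The second step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the context is selected, which distance is used between conditional distributions, or how the per-attribute distances are combined. The code below therefore assumes that the context is supplied by hand, that distributions are compared with the Euclidean distance, and that the results are averaged. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import sqrt


def conditional_distribution(rows, target, context_attr, value):
    """Estimate P(context_attr | target == value) from raw frequency counts."""
    counts = Counter(r[context_attr] for r in rows if r[target] == value)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {k: c / total for k, c in counts.items()}


def value_dissimilarity(rows, target, context, a, b):
    """Dissimilarity between values a and b of attribute `target`.

    Assumption (not from the paper): Euclidean distance between the
    conditional distributions of each context attribute, averaged
    over the context attributes.
    """
    if not context:
        return 0.0
    total = 0.0
    for attr in context:
        p = conditional_distribution(rows, target, attr, a)
        q = conditional_distribution(rows, target, attr, b)
        support = set(p) | set(q)
        total += sqrt(sum((p.get(v, 0.0) - q.get(v, 0.0)) ** 2 for v in support))
    return total / len(context)


# Hypothetical toy data: "red" and "crimson" co-occur with the same
# shapes and sizes, while "blue" does not.
rows = [
    {"color": "red", "shape": "round", "size": "small"},
    {"color": "red", "shape": "round", "size": "large"},
    {"color": "crimson", "shape": "round", "size": "small"},
    {"color": "crimson", "shape": "round", "size": "large"},
    {"color": "blue", "shape": "square", "size": "small"},
    {"color": "blue", "shape": "square", "size": "small"},
]
context = ["shape", "size"]  # hand-picked; stands in for the selection step

d_close = value_dissimilarity(rows, "color", context, "red", "crimson")
d_far = value_dissimilarity(rows, "color", context, "red", "blue")
print(d_close, d_far)
```

In this toy data, "red" and "crimson" induce identical conditional distributions over the context attributes, so their dissimilarity is zero, while "red" and "blue" differ on both context attributes. A nearest neighbor classifier could use such per-value dissimilarities in place of simple match/mismatch comparison.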