• DocumentCode
    2159
  • Title

    Efficient Classification for Metric Data

  • Author

    Gottlieb, Lee-Ad ; Kontorovich, Aryeh ; Krauthgamer, Robert

  • Author_Institution
    Dept. of Comput. Sci. & Math., Ariel Univ. of Samaria, Samaria, Israel
  • Volume
    60
  • Issue
    9
  • fYear
    2014
  • fDate
    Sept. 2014
  • Firstpage
    5750
  • Lastpage
    5759
  • Abstract
    Recent advances in large-margin classification of data residing in general metric spaces (rather than Hilbert spaces) enable classification under various natural metrics, such as string edit and earthmover distance. A general framework developed for this purpose left open the questions of computational efficiency and of providing direct bounds on generalization error. We design a new algorithm for classification in general metric spaces, whose runtime and accuracy depend on the doubling dimension of the data points, and can thus achieve superior classification performance in many common scenarios. The algorithmic core of our approach is an approximate (rather than exact) solution to the classical problems of Lipschitz extension and of nearest neighbor search. The algorithm's generalization performance is guaranteed via the fat-shattering dimension of Lipschitz classifiers, and we present experimental evidence of its superiority to some common kernel methods. As a by-product, we offer a new perspective on the nearest neighbor classifier, which yields significantly sharper risk asymptotics than the classic analysis.
  • Keywords
    data handling; pattern classification; Lipschitz classifiers; Lipschitz extension; computational efficiency; data points; earthmover distance; fat-shattering dimension; general metric spaces; generalization error; large-margin data classification; metric data; nearest neighbor search; risk asymptotics; string edit; Algorithm design and analysis; Approximation algorithms; Extraterrestrial measurements; Hilbert space; Nearest neighbor searches; Training; Classification; Lipschitz function; doubling dimension; metric space
  • fLanguage
    English
  • Journal_Title
    Information Theory, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9448
  • Type

    jour

  • DOI
    10.1109/TIT.2014.2339840
  • Filename
    6867374