• DocumentCode
    3189336
  • Title
    Distance Metric Learning through Optimization of Ranking
  • Author
    Gopal, Kreshna; Ioerger, Thomas R.
  • Author_Institution
    Texas A&M Univ., College Station
  • fYear
    2007
  • fDate
    28-31 Oct. 2007
  • Firstpage
    201
  • Lastpage
    206
  • Abstract
    Data preprocessing is important in machine learning, data mining, and pattern recognition. In particular, selecting relevant features in high-dimensional data is often necessary to efficiently construct models that accurately describe the data. For example, many lazy learning algorithms (like k-Nearest Neighbor) rely on feature-based distance metrics to compare input patterns for the purpose of classification or retrieval from a database. In previous work, we introduced Slider, a distance metric learning method that optimizes the weights of features in a protein model-building application (where features are used to describe patterns of electron density around protein macromolecules). In this work, we demonstrate the usefulness of Slider as a general method for classification, ranking and retrieval, with results on several benchmark datasets. We also compare it to other well-known feature selection or weighting methods.
  • Keywords
    biology computing; data mining; information retrieval; learning (artificial intelligence); optimisation; pattern classification; proteins; Slider algorithm; data preprocessing; distance metric learning; machine learning; optimization; pattern recognition; protein model-building application; Learning systems; Machine learning algorithms; Nearest neighbor searches; Spatial databases
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining Workshops, 2007. ICDM Workshops 2007. Seventh IEEE International Conference on
  • Conference_Location
    Omaha, NE
  • Print_ISBN
    978-0-7695-3019-2
  • Electronic_ISBN
    978-0-7695-3033-8
  • Type
    conf
  • DOI
    10.1109/ICDMW.2007.113
  • Filename
    4476668