• DocumentCode
    1986970
  • Title
    Discovering compact and highly discriminative features or combinations of drug activities using support vector machines
  • Author
    Yu, Hwanjo ; Yang, Jian ; Wang, W. ; Han, Jiawei
  • Author_Institution
    Dept. of Comput. Sci., Illinois Univ., Urbana, IL, USA
  • fYear
    2003
  • fDate
    11-14 Aug. 2003
  • Firstpage
    220
  • Lastpage
    228
  • Abstract
    High-throughput experimental techniques now make it feasible to examine and collect massive data at the molecular level. These data, typically mapped to a very high-dimensional feature space, carry rich information about the functionalities of certain chemical or biological entities and can be used to infer valuable knowledge for classification and prediction. Typically, a small number of features or feature combinations may play determinant roles in functional discrimination, so identifying such features or feature combinations is of great importance. In this paper, we study the problem of discovering compact and highly discriminative features or feature combinations from a rich feature collection. We employ the support vector machine as the classifier and aim to find compact feature combinations. Compared with previous feature selection methods, which identify features solely on the basis of their individual roles in the classification, our method systematically identifies minimal feature combinations that ultimately play determinant roles. An experimental study on drug activity data shows that our method can discover descriptors that are not necessarily significant individually but are most significant collectively.
  • Keywords
    biology computing; data mining; drugs; feature extraction; support vector machines; biological entities; chemical entities; compact feature discovering; drug activities; drug activity data; feature collection; feature combinations; feature selection; functional discrimination; high dimensional feature space; high throughput experimental techniques; massive data collection; molecular level; support vector machine; Bioinformatics
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics Conference, 2003. CSB 2003. Proceedings of the 2003 IEEE
  • Print_ISBN
    0-7695-2000-6
  • Type
    conf
  • DOI
    10.1109/CSB.2003.1227321
  • Filename
    1227321