Title :
Confident Identification of Relevant Objects Based on Nonlinear Rescaling Method and Transductive Inference
Author :
Ho, Shen-Shyang ; Polyak, Roman
Author_Institution :
George Mason Univ., Fairfax
Abstract :
We present a novel machine learning algorithm for identifying relevant objects in a large amount of data. The approach is driven by linear discrimination based on the nonlinear rescaling (NR) method and on transductive inference. The NR algorithm for linear discrimination (NRLD) computes both the primal and the dual approximation at each step. The dual variables associated with the given labeled data-set provide important information about the objects in the data-set and play the key role in ordering these objects. A confidence score based on a transductive inference procedure using NRLD is used to rank and identify the relevant objects in a pool of unlabeled data. Experimental results on an unbalanced protein data-set for the drug target prioritization and identification problem illustrate the feasibility of the proposed identification algorithm.
Keywords :
approximation theory; data handling; drugs; learning (artificial intelligence); proteins; confidence score; confident identification; drug identification problem; drug target prioritization; dual approximation; linear discrimination; machine learning algorithm; nonlinear rescaling method; relevant objects; transductive inference; unbalanced protein data-set; Approximation algorithms; Computer science; Data mining; Drugs; Inference algorithms; Lagrangian functions; Machine learning algorithms; Proteins; Support vector machine classification; Support vector machines;
Conference_Title :
Seventh IEEE International Conference on Data Mining (ICDM 2007)
Conference_Location :
Omaha, NE
Print_ISBN :
978-0-7695-3018-5
DOI :
10.1109/ICDM.2007.24