Title :
General MC: estimating boundary of positive class from small positive data
Author_Institution :
Dept. of Comput. Sci., Illinois Univ., Urbana, IL, USA
Abstract :
Single-class classification (SCC) seeks to distinguish one class of data from the universal set of multiple classes. We propose an SCC method called general MC that estimates an accurate classification boundary of the positive class from a small positive data set using the distribution of unlabeled data. Our theoretical and empirical analyses show that, as long as the distribution of unlabeled data is not highly skewed in the feature space, general MC significantly outperforms other recent SCC methods when the positive data set is highly under-sampled.
Keywords :
character recognition; learning (artificial intelligence); pattern classification; statistical analysis; support vector machines; classification boundary; empirical analyses; feature space; general MC; positive data set; single-class classification; unlabeled data; Algorithm design and analysis; Computer science; Convergence; Functional analysis; Image analysis; Image databases; Performance analysis; Resumes; Training data; Web pages
Conference_Title :
Data Mining, 2003. ICDM 2003. Third IEEE International Conference on
Print_ISBN :
0-7695-1978-4
DOI :
10.1109/ICDM.2003.1251010