Title :
Transferred Feature Selection
Author :
Bi, Wei ; Shi, Yuan ; Lan, Zhenzhong
Author_Institution :
Dept. of Comput. Sci., Sun Yat-sen Univ., Guangzhou, China
Abstract :
Traditional feature selection algorithms require a large number of labeled training instances to identify the most informative subset of features. In many real-world applications, however, labeled data are difficult, expensive, or time-consuming to obtain. Several semi-supervised feature selection algorithms have recently been proposed that use unlabeled data to assist feature selection, but these methods assume that the labeled and unlabeled data follow the same distribution. In this paper, we propose a new framework named transferred feature selection (TFS), which uses out-of-domain labeled data to compensate for the lack of same-distribution labeled training data. The out-of-domain data are labeled but follow a different distribution from the same-distribution data, so most supervised or semi-supervised feature selection algorithms do not work well with them. The key idea of TFS is to transfer knowledge from the out-of-domain instances in order to select a feature subset that yields high prediction accuracy. The framework is implemented using the k-NN method. Analysis and experiments show that TFS can effectively exploit out-of-domain instances to improve the performance of feature selection.
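The abstract does not spell out the TFS procedure, so the following is only a rough sketch of the general idea and not the authors' algorithm. It assumes a greedy forward wrapper: each candidate feature subset is scored by leave-one-out k-NN accuracy on the few same-distribution labeled instances, while the out-of-domain labeled instances always stay in the k-NN training pool. The function names (`knn_predict`, `loo_accuracy`, `greedy_select`), the squared Euclidean distance, the stopping rule, and the toy data are all hypothetical choices made for illustration.

```python
from collections import Counter


def knn_predict(train_X, train_y, x, features, k=3):
    """Majority vote among the k nearest training rows, using only `features`."""
    dists = sorted(
        (sum((row[f] - x[f]) ** 2 for f in features), i)
        for i, row in enumerate(train_X)
    )
    votes = Counter(train_y[i] for _, i in dists[:k])
    return votes.most_common(1)[0][0]


def loo_accuracy(in_X, in_y, out_X, out_y, features, k=3):
    """Leave-one-out accuracy on the same-distribution instances.

    Assumption: out-of-domain instances are never held out; they only
    enlarge the training pool used to classify each held-out instance.
    """
    correct = 0
    for i in range(len(in_X)):
        pool_X = in_X[:i] + in_X[i + 1:] + out_X
        pool_y = in_y[:i] + in_y[i + 1:] + out_y
        correct += knn_predict(pool_X, pool_y, in_X[i], features, k) == in_y[i]
    return correct / len(in_X)


def greedy_select(in_X, in_y, out_X, out_y, max_features, k=3):
    """Greedy forward selection; stops when no candidate improves the score."""
    selected, best = [], 0.0
    remaining = list(range(len(in_X[0])))
    while remaining and len(selected) < max_features:
        scored = [
            (loo_accuracy(in_X, in_y, out_X, out_y, selected + [f], k), f)
            for f in remaining
        ]
        # Highest score wins; ties go to the lowest feature index.
        score, f = max(scored, key=lambda t: (t[0], -t[1]))
        if score <= best:
            break
        selected.append(f)
        remaining.remove(f)
        best = score
    return selected, best


# Toy data (made up): feature 0 separates the classes, feature 1 is noise.
in_X = [[0.0, 5.0], [0.2, 1.0], [1.0, 4.8], [1.2, 0.9]]
in_y = [0, 0, 1, 1]
out_X = [[-0.3, 3.0], [-0.1, 0.0], [0.1, 6.0], [0.9, 2.0], [1.1, 7.0], [1.3, -1.0]]
out_y = [0, 0, 0, 1, 1, 1]

selected, best = greedy_select(in_X, in_y, out_X, out_y, max_features=2)
print(selected, best)
```

Scoring only on the same-distribution instances is a deliberate choice in this sketch: it keeps the selection criterion tied to the target distribution while still letting the out-of-domain labels inform the neighbors. The paper may weight or filter the out-of-domain instances differently.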
Keywords :
knowledge based systems; pattern recognition; k-NN method; knowledge transfer; labeled training; semi-supervised feature selection; transferred feature selection; Application software; Bismuth; Computer science; Conferences; Data mining; Labeling; Machine learning; Sun; Testing; Training data;
Conference_Title :
Data Mining Workshops, 2009. ICDMW '09. IEEE International Conference on
Conference_Location :
Miami, FL
Print_ISBN :
978-1-4244-5384-9
Electronic_ISBN :
978-0-7695-3902-7
DOI :
10.1109/ICDMW.2009.102