Feature selection from huge feature sets

Author

Bins, José ; Draper, Bruce A.

Author_Institution

Fac. de Inf., Pontificia Univ. Catolica, Porto Alegre, Brazil

Volume

2

fYear

2001

fDate

2001

Firstpage

159

Abstract

The number of features that can be computed over an image is, for practical purposes, limitless. Unfortunately, the number of features that can be computed and exploited by most computer vision systems is considerably less. As a result, it is important to develop techniques for selecting features from very large data sets that include many irrelevant or redundant features. This work addresses the feature selection problem by proposing a three-step algorithm. The first step uses a variation of the well-known Relief algorithm to remove irrelevance; the second step clusters features using K-means to remove redundancy; and the third step is a standard combinatorial feature selection algorithm. This three-step combination is shown to be more effective than standard feature selection algorithms for large data sets with many irrelevant and redundant features. It is also shown to be no worse than standard techniques for data sets that do not have these properties. Finally, we show a third experiment in which a data set with 4096 features is reduced to 5% of its original size with very little information loss.
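
The three steps named in the abstract can be illustrated with a minimal Python sketch, assuming NumPy is available. This is not the authors' implementation: the abstract does not specify their Relief variation, their clustering distance, or their combinatorial search, so the code below substitutes the basic two-class Relief, K-means on standardized feature columns (keeping the highest-weighted feature per cluster), and greedy forward selection scored by nearest-class-mean accuracy. All function names, parameters, and defaults are hypothetical.

```python
import numpy as np

def relief_weights(X, y, n_iter=100, seed=0):
    """Basic two-class Relief (a stand-in for the paper's variation)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Xs = (X - X.min(axis=0)) / span              # scale each feature to [0, 1]
    w = np.zeros(d)
    for i in rng.integers(0, n, size=n_iter):
        dist = np.abs(Xs - Xs[i]).sum(axis=1)    # L1 distance to every sample
        dist[i] = np.inf                         # never pick the sample itself
        same = (y == y[i])
        hit = np.argmin(np.where(same, dist, np.inf))    # nearest same-class sample
        miss = np.argmin(np.where(~same, dist, np.inf))  # nearest other-class sample
        w += np.abs(Xs[i] - Xs[miss]) - np.abs(Xs[i] - Xs[hit])
    return w / n_iter

def kmeans_labels(P, k, n_iter=50, seed=0):
    """Plain K-means (Lloyd's algorithm) over the rows of P."""
    rng = np.random.default_rng(seed)
    centers = P[rng.choice(len(P), size=k, replace=False)]
    for _ in range(n_iter):
        d2 = ((P[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = P[labels == c].mean(axis=0)
    return labels

def nearest_mean_accuracy(X, y):
    """Training accuracy of a nearest-class-mean classifier (assumed criterion)."""
    classes = np.unique(y)
    means = np.stack([X[y == c].mean(axis=0) for c in classes])
    d2 = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    return float((classes[d2.argmin(axis=1)] == y).mean())

def forward_select(X, y, candidates, n_select):
    """Greedy sequential forward selection over the candidate feature indices."""
    chosen, remaining = [], list(candidates)
    while remaining and len(chosen) < n_select:
        best = max(remaining,
                   key=lambda f: nearest_mean_accuracy(X[:, chosen + [f]], y))
        chosen.append(best)
        remaining.remove(best)
    return chosen

def three_step_select(X, y, relief_threshold=0.0, n_clusters=5, n_select=3, seed=0):
    # Step 1: drop features whose Relief weight marks them as irrelevant.
    w = relief_weights(X, y, seed=seed)
    relevant = np.flatnonzero(w > relief_threshold)
    if relevant.size == 0:
        return []
    # Step 2: cluster the surviving feature columns; keep one feature per cluster.
    cols = X[:, relevant]
    std = cols.std(axis=0)
    std[std == 0] = 1.0
    P = ((cols - cols.mean(axis=0)) / std).T     # one row per feature
    labels = kmeans_labels(P, min(n_clusters, len(relevant)), seed=seed)
    reps = [int(relevant[labels == c][np.argmax(w[relevant][labels == c])])
            for c in np.unique(labels)]
    # Step 3: combinatorial search over the much smaller candidate set.
    return forward_select(X, y, reps, n_select)
```

Calling `three_step_select(X, y)` with a samples-by-features array `X` and a two-class label vector `y` returns a list of selected column indices. The point of the ordering is cost: the two inexpensive filters shrink the candidate set before the expensive combinatorial search runs.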

Keywords

computer vision; feature extraction; pattern clustering; Relief algorithm; clusters; feature selection; huge feature sets; redundancy; Biometrics; Computer science; Data mining; Object recognition; Particle measurements; Principal component analysis; Probes; Size measurement; Supervised learning

fLanguage

English

Publisher

IEEE

Conference_Titel

Computer Vision, 2001. ICCV 2001. Proceedings. Eighth IEEE International Conference on

Conference_Location

Vancouver, BC

Print_ISBN

0-7695-1143-0

Type

conf

DOI

10.1109/ICCV.2001.937619

Filename

937619