DocumentCode
3776385
Title
A sequential cosine similarity based feature selection technique for high dimensional datasets
Author
Vimal Kumar Dubey;Amit Kumar Saxena
Author_Institution
Department of Computer Science and Information Technology, Guru Ghasidas Vishwavidyalaya, Bilaspur, Chattisgarh, India, 495009
fYear
2015
Firstpage
1
Lastpage
5
Abstract
The growing everyday use of information processing has made databases very large. In most cases, not all parameters (called features here) are needed to decide the outcome (or decision) of an instance, so feature selection is an important step in data processing. This paper presents a novel feature selection method. The cosine similarity of each feature with the class is computed, and the features are stored in an array in descending order of similarity. The first feature of this array is then combined with the remaining features sequentially, one at a time. If the classification accuracy of the combination increases, the combination is accepted; otherwise the feature responsible is eliminated from the combination. In this manner all features are tested and a final subset of features is obtained. Results from rigorous experiments with the proposed method on high dimensional databases, and comparisons with previously reported methods, are encouraging. The proposed method is therefore recommended for high dimensional data processing.
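The procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not name the classifier, so a leave-one-out 1-nearest-neighbour accuracy is assumed as a stand-in, class labels are assumed to be numeric so that a cosine with the class vector is defined, and all function names (`cosine_similarity`, `loo_1nn_accuracy`, `sequential_cosine_selection`) are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for an all-zero vector
    return dot / (norm_u * norm_v)


def loo_1nn_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier.

    Assumed stand-in: the abstract does not say which classifier is used.
    Only the columns listed in `features` contribute to the distance.
    """
    correct = 0
    for i in range(len(X)):
        best_j, best_d = None, float("inf")
        for j in range(len(X)):
            if j == i:
                continue
            d = sum((X[i][f] - X[j][f]) ** 2 for f in features)
            if d < best_d:
                best_j, best_d = j, d
        if y[best_j] == y[i]:
            correct += 1
    return correct / len(X)


def sequential_cosine_selection(X, y, evaluate=loo_1nn_accuracy):
    """Rank features by cosine similarity with the class, then add them
    one at a time, keeping a feature only if accuracy strictly increases."""
    n_features = len(X[0])
    columns = [[row[f] for row in X] for f in range(n_features)]
    # Feature indices in descending order of similarity with the class vector.
    ranked = sorted(
        range(n_features),
        key=lambda f: cosine_similarity(columns[f], y),
        reverse=True,
    )
    selected = [ranked[0]]
    best = evaluate(X, y, selected)
    for f in ranked[1:]:
        candidate = selected + [f]
        accuracy = evaluate(X, y, candidate)
        if accuracy > best:
            selected, best = candidate, accuracy
        # Otherwise feature f is dropped from the combination.
    return selected, best
```

Calling `sequential_cosine_selection(X, y)` with `X` as a list of numeric rows and `y` as a list of numeric labels returns the selected feature indices and the accuracy reached. Passing a different `evaluate` function swaps in another classifier without changing the selection loop.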
Keywords
"Databases","Feature extraction","Classification algorithms","Testing","Robustness","Brain models"
Publisher
IEEE
Conference_Titel
Systems Conference (NSC), 2015 39th National
Type
conf
DOI
10.1109/NATSYS.2015.7489113
Filename
7489113
Link To Document