DocumentCode
1736644
Title
A comparative study of various feature selection techniques in high-dimensional data set to improve classification accuracy
Author
Shroff, Kandarp P. ; Maheta, Hardik H.
Author_Institution
Dept. of Inf. Technol., Dharmsinh Desai Univ., Nadiad, India
fYear
2015
Firstpage
1
Lastpage
6
Abstract
The performance of a machine learning algorithm depends on the features considered from the dataset. A high-dimensional dataset degrades the performance of a learning algorithm because the algorithm tries to analyze and accommodate all of the features. Feature selection is used as a pre-processing step to analyze and compress large datasets. Its main objective is to identify relevant features and remove redundant ones from a high-dimensional dataset, with the goals of reducing the dimensionality of the dataset, improving classification accuracy, reducing computational cost, and enabling better visualization of the data. By considering only useful features, the performance of a classification algorithm can be improved. To select a reduced set of relevant features from the set of all features, various search techniques such as complete search, random search, and sequential search are used. Each generated subset of features is validated using an evaluation technique such as the filter, wrapper, or hybrid approach. The main aim of any search technique is to generate an optimal subset of features. This paper summarizes a general methodology for the feature selection process on the basis of search and evaluation techniques and provides a comprehensive review of recent developments in feature selection techniques. Classification with an appropriate feature selection technique has shown better performance in the field of machine learning.
Keywords
data visualisation; feature selection; learning (artificial intelligence); pattern classification; classification accuracy; data visualization; feature selection techniques; high dimensional dataset; high-dimensional data set; machine learning algorithm; Accuracy; Classification algorithms; Filtering algorithms; Genetic algorithms; Machine learning algorithms; Search problems; Time complexity; Classification; Evaluation Techniques; Feature Selection; High dimensional datasets; Search Techniques;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Communication and Informatics (ICCCI), 2015 International Conference on
Conference_Location
Coimbatore
Print_ISBN
978-1-4799-6804-6
Type
conf
DOI
10.1109/ICCCI.2015.7218098
Filename
7218098