Regional Information Center for Science and Technology - Using a clustering similarity measure for feature selection in high dimensional data sets

DocumentCode :

2060494

Title :

Using a clustering similarity measure for feature selection in high dimensional data sets

Author :

Santos, Jorge M. ; Ramos, Sandra

Author_Institution :

Inst. de Eng. Biomedica, Inst. Super. de Eng. do Porto, Porto, Portugal

fYear :

2010

fDate :

Nov. 29 2010-Dec. 1 2010

Firstpage :

900

Lastpage :

905

Abstract :

Feature selection is a very important preprocessing step in data classification. Applying it reduces the dimensionality of the problem by removing redundant or irrelevant data. High dimensional data sets are becoming common, especially in bioinformatics, biology, signal processing and text classification, increasing the need for efficient feature selection methods. In this paper we study the applicability of a clustering validation measure, the Adjusted Rand Index (ARI), to this task, comparing it with other methods based on statistical tests and on the ROC curve. Our experiments show the validity of the proposed method.
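
The abstract does not spell out how the ARI is turned into a feature score, so the following is only a minimal sketch under stated assumptions, not the authors' procedure. It assumes each feature is split into two groups at its median (a hypothetical choice made here for illustration) and that the ARI between that split and the class labels is used to rank features. The names `adjusted_rand_index` and `rank_features` are illustrative.

```python
from collections import Counter
from math import comb
from statistics import median


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two partitions of the same samples."""
    n = len(labels_a)
    if n != len(labels_b):
        raise ValueError("both labelings must cover the same samples")
    # Pair counts from the contingency table and from its two margins.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 0.0  # fewer than two samples: no pairs to compare
    expected = sum_a * sum_b / total_pairs
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        # Degenerate partitions (e.g. a constant feature and a single class).
        # Assumption: treat these as uninformative for ranking purposes.
        return 0.0
    return (sum_cells - expected) / (max_index - expected)


def rank_features(rows, labels):
    """Rank features by ARI between a median split and the class labels.

    The median split is an assumption of this sketch; any per-feature
    clustering could be substituted.
    """
    scores = []
    for j in range(len(rows[0])):
        values = [row[j] for row in rows]
        cut = median(values)
        partition = [v > cut for v in values]
        scores.append((j, adjusted_rand_index(partition, labels)))
    # Highest agreement with the class labels first.
    return sorted(scores, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy data: feature 0 separates the classes, feature 1 does not.
    rows = [[1.0, 5.0], [1.2, 1.0], [0.9, 4.0],
            [3.0, 2.0], [3.1, 4.5], [2.9, 1.5]]
    labels = [0, 0, 0, 1, 1, 1]
    for index, score in rank_features(rows, labels):
        print(f"feature {index}: ARI = {score:.3f}")
```

Because the ARI is corrected for chance, a feature whose split is unrelated to the classes scores near zero (or slightly below), which makes a simple threshold or top-k cut on the ranking plausible as a selection rule.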

Keywords :

data handling; feature extraction; pattern classification; pattern clustering; statistical analysis; ROC curve; adjusted Rand index; clustering similarity measure; data classification; feature selection; feature selection methods; high dimensional data sets; statistical tests

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Intelligent Systems Design and Applications (ISDA), 2010 10th International Conference on

Conference_Location :

Cairo

Print_ISBN :

978-1-4244-8134-7

Type :

conf

DOI :

10.1109/ISDA.2010.5687073

Filename :

5687073

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2060494