DocumentCode :
1513586
Title :
Fuzzy–Rough Supervised Attribute Clustering Algorithm and Classification of Microarray Data
Author :
Maji, Pradipta
Author_Institution :
Machine Intell. Unit, Indian Stat. Inst., Kolkata, India
Volume :
41
Issue :
1
fYear :
2011
Firstpage :
222
Lastpage :
233
Abstract :
One of the major tasks with gene expression data is to find groups of coregulated genes whose collective expression is strongly associated with sample categories. In this regard, a new clustering algorithm, termed fuzzy-rough supervised attribute clustering (FRSAC), is proposed to find such groups of genes. The proposed algorithm is based on the theory of fuzzy-rough sets, which directly incorporates the information of sample categories into the gene clustering process. A new quantitative measure is introduced based on fuzzy-rough sets that incorporates the information of sample categories to measure the similarity among genes. The proposed algorithm is based on measuring the similarity between genes using the new quantitative measure, whereby redundancy among the genes is removed. The clusters are refined incrementally based on sample categories. The effectiveness of the proposed FRSAC algorithm, along with a comparison with existing supervised and unsupervised gene selection and clustering algorithms, is demonstrated on six cancer and two arthritis data sets based on the class separability index and the predictive accuracy of the naive Bayes classifier, the K-nearest neighbor rule, and the support vector machine.
Keywords :
biotechnology; fuzzy systems; genetics; pattern clustering; rough set theory; FRSAC algorithm; K-nearest neighbor rule; arthritis data; cancer data; coregulated gene; fuzzy rough supervised attribute clustering algorithm; gene clustering process; gene expression data; gene selection; gene similarity; microarray data; naive Bayes classifier; predictive accuracy; quantitative measure; sample category; support vector machine; Accuracy; Arthritis; Cancer; Classification algorithms; Clustering algorithms; Clustering methods; Gene expression; Redundancy; Support vector machine classification; Support vector machines; Attribute clustering; classification; gene selection; microarray analysis; rough sets; Algorithms; Arthritis; Artificial Intelligence; Bayes Theorem; Cluster Analysis; Computational Biology; Databases, Genetic; Female; Fuzzy Logic; Humans; Male; Neoplasms; Oligonucleotide Array Sequence Analysis
fLanguage :
English
Journal_Title :
Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on
Publisher :
IEEE
ISSN :
1083-4419
Type :
jour
DOI :
10.1109/TSMCB.2010.2050684
Filename :
5483124