Finding Rule Groups to Classify High Dimensional Gene Expression Datasets

Author

An, Jiyuan ; Chen, Yi-Ping Phoebe

Author_Institution

Fac. of Sci. & Technol., Deakin Univ.

Volume

1

fYear

0

fDate

0-0 0

Firstpage

1196

Lastpage

1199

Abstract

Microarray data provide quantitative information about the transcription profiles of cells. Machine learning methods have increasingly attracted bioinformatics researchers for analyzing microarray datasets, and several such approaches are widely used to classify and mine biological datasets. However, many gene expression datasets have extremely high dimensionality, so traditional machine learning methods cannot be applied effectively or efficiently. This paper proposes a robust algorithm for finding rule groups to classify gene expression datasets. Most classification algorithms select dimensions (genes) heuristically to form the rule groups that identify classes such as cancerous and normal tissues. In contrast, our algorithm is guaranteed to find the best-k dimensions (genes), those that discriminate most strongly between samples of different classes, and uses them to form rule groups for classifying expression datasets. Our experiments show that the rule groups obtained by our algorithm achieve higher accuracy than those of other classification approaches.

Keywords

biology computing; genetics; learning (artificial intelligence); pattern classification; best-k dimension; biological dataset classification; biological dataset mining; high dimensional gene expression dataset classification; microarray data; microarray dataset analysis; robust algorithm; rule group; transcription cell profile; Australia Council; Bioinformatics; Classification algorithms; Data analysis; Decision trees; Gene expression; Learning systems; Machine learning; Machine learning algorithms; Robustness;

fLanguage

English

Publisher

IEEE

Conference_Titel

Pattern Recognition, 2006. ICPR 2006. 18th International Conference on

Conference_Location

Hong Kong

ISSN

1051-4651

Print_ISBN

0-7695-2521-0

Type

conf

DOI

10.1109/ICPR.2006.564

Filename

1699104