Title :
Sample selection of microarray data using rough-fuzzy based approach
Author :
Paul, Amit ; Sil, Jaya
Author_Institution :
Comput. Sci. & Eng. Dept., Gurunanak Inst. of Technol., Sodpur, India
Abstract :
Although DNA microarray technology simultaneously measures the expression levels of thousands of genes, only a few underlying gene features may account for significant data variation in gene classification problems. Selecting features from a huge data set is difficult, so dimension reduction of the gene expression data set is essential for determining the important features, which play a key role in predicting an outcome. Rough set theory (RST) has recently been used for dimension reduction of data; however, the existing methods are inadequate for finding a minimal reduct. The paper proposes an RST-based technique, applied to gene expression data, that performs dimension reduction by obtaining a single reduct in one pass. The gene expression data are discretized using linguistic terms with proper semantics and represented by fuzzy sets. The discretized values are calculated using a Gaussian membership function with varied mean and standard deviation in order to eliminate the ambiguity between different linguistic terms. The genes are classified using linguistic decision attribute values based on the frequency of gene expression data. Discretization and classification of gene expression data are performed simultaneously, which significantly reduces the time complexity of the procedure. Thus, the proposed framework selects the most significant samples for gene classification, resulting in dimension reduction. Unlike other existing methods, the proposed method produces output that exhibits no variation with experimental microarray gene information.
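A minimal Python sketch of the Gaussian-membership discretization step mentioned in the abstract, under stated assumptions: the three linguistic terms, their means and standard deviations, the normalized 0..1 expression scale, and the function names are all hypothetical and not taken from the paper. Assigning the term with the highest membership is one simple way to resolve ambiguity between terms, not necessarily the authors' rule.

```python
import math

# Hypothetical linguistic terms as (mean, standard deviation) pairs on an
# assumed normalized expression scale of 0..1. Illustrative values only.
TERMS = {
    "low": (0.0, 0.2),
    "medium": (0.5, 0.15),
    "high": (1.0, 0.2),
}


def gaussian_membership(x, mean, sigma):
    """Degree (0..1] to which x belongs to a fuzzy set centred at mean."""
    return math.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2))


def discretize(x, terms=TERMS):
    """Return the linguistic term with the highest membership, plus all degrees."""
    memberships = {
        name: gaussian_membership(x, mean, sigma)
        for name, (mean, sigma) in terms.items()
    }
    best = max(memberships, key=memberships.get)
    return best, memberships


if __name__ == "__main__":
    for value in (0.05, 0.5, 0.95):
        term, degrees = discretize(value)
        print(value, term, {k: round(v, 3) for k, v in degrees.items()})
```

Giving each term its own mean and standard deviation lets the sets differ in width, so the overlap between neighbouring terms can be tuned per term.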
Keywords :
DNA; Gaussian processes; biology computing; fuzzy set theory; pattern classification; rough set theory; DNA microarray technology; Gaussian membership function; fuzzy sets; gene expression data; linguistic decision attribute values; microarray data selection; rough set theory; Biomedical measurements; Clustering algorithms; Computer science; DNA; Data analysis; Data engineering; Gene expression; Genetics; Proteins; Set theory;
Conference_Title :
Nature & Biologically Inspired Computing, 2009. NaBIC 2009. World Congress on
Conference_Location :
Coimbatore
Print_ISBN :
978-1-4244-5053-4
DOI :
10.1109/NABIC.2009.5393852