DocumentCode
3061060
Title
Feature Extraction from Microarray Expression Data by Integration of Semantic Knowledge
Author
Cho, Young-Rae ; Xu, Xian ; Hwang, Woochang ; Zhang, Aidong
Author_Institution
State Univ. of New York at Buffalo, Buffalo
fYear
2007
fDate
13-15 Dec. 2007
Firstpage
606
Lastpage
611
Abstract
Microarray techniques give biologists a first peek into the molecular states of living tissues. Previous studies have shown that it is feasible to build sample classifiers using gene expression profiles. To build an effective sample classifier, a dimension reduction process is necessary, since classic pattern recognition algorithms do not work well in high-dimensional spaces. In this paper, we present a novel feature extraction algorithm based on the concept of virtual genes, which integrates microarray expression data sets with domain knowledge embedded in Gene Ontology (GO) annotations. We define a semantic similarity measure that quantifies the functional association between two genes using the annotations on each GO term. We then identify groups of genes, called virtual genes, that potentially interact with each other for a biological function. The correlation in gene expression levels of virtual genes can be used to build a sample classifier. For a colon cancer data set, integrating microarray expression data with GO annotations significantly improves the accuracy of sample classification by more than 10%.
Keywords
biological tissues; cancer; feature extraction; genetic engineering; medical image processing; ontologies (artificial intelligence); pattern classification; semantic Web; virtual reality; biological function; colon cancer; feature extraction; gene expressional profiles; gene ontology annotations; living tissues; microarray expression data; molecular states; pattern classification; pattern recognition algorithms; semantic knowledge integration; virtual genes; Application software; Computer science; Data analysis; Data engineering; Feature extraction; Gene expression; Knowledge engineering; Machine learning; Ontologies; Spatial databases;
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Applications, 2007. ICMLA 2007. Sixth International Conference on
Conference_Location
Cincinnati, OH
Print_ISBN
978-0-7695-3069-7
Type
conf
DOI
10.1109/ICMLA.2007.10
Filename
4457296