Title :
The application of random forest in genetic case-control studies
Author :
Mao, Weidong ; Mao, Jinghe
Author_Institution :
Fac. of Math. & Comput. Sci., Virginia State Univ., Petersburg, VA
Abstract :
High-throughput single nucleotide polymorphism (SNP) genotyping technologies have made massive genotype data sets, covering large numbers of individuals, publicly available. This accessibility makes genome-wide association studies of complex diseases possible. Susceptibility to complex diseases can be predicted by analyzing the genetic data, helping prospective patients make informed decisions. With the development of DNA microarray techniques, it is possible to access human genetic information related to specific diseases. This paper uses a combinatorial method to analyze genetic case-control data for Crohn's disease and to search for disease-associated factors in given samples. A random forest based method was applied to publicly available genotype data on Crohn's disease for an association study and achieved promising results.
Keywords :
DNA; cellular biophysics; diseases; genetics; molecular biophysics; random processes; Crohn disease; DNA microarray; complex diseases; disease-associated factors; genetic case-control studies; genome-wide association; genotyping; human genetic information; random forest; single nucleotide polymorphism; Bioinformatics; Biomedical engineering; DNA; Data engineering; Diseases; Genetics; Genomics; Humans; Information technology; Proteins;
Conference_Title :
International Conference on Information Technology and Applications in Biomedicine, 2008 (ITAB 2008)
Conference_Location :
Shenzhen
Print_ISBN :
978-1-4244-2254-8
Electronic_ISBN :
978-1-4244-2255-5
DOI :
10.1109/ITAB.2008.4570522