Title :
Genomic profiling by machine learning
Author :
Zhang, Zhiping ; Lin, Honghuang
Author_Institution :
Tongji Sch. of Pharmacy, Huazhong Univ. of Sci. & Technol., Wuhan, China
Abstract :
Genome-wide association studies (GWAS) have identified a number of genetic mutations associated with various diseases. Including SNPs from GWAS has shown some benefit in improving traditional risk prediction. However, the contribution of genomic profiling to disease classification remains unclear. Here we present a systematic analysis of the effects of genomic profiling by machine learning. Our study suggests that the odds ratio plays an essential role in disease classification. To explain the majority of the variation in a typical case/control study, at least 100 SNPs with an effect size of 1.5 and a frequency of 20% have to be included. This result indicates that more disease-related variants must be identified before genomic profiling becomes meaningful. Support vector machines (SVMs) can recognize the classification pattern with relatively few samples and are more resistant to noise.
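The scenario in the abstract (100 SNPs, effect size 1.5, 20% frequency) can be illustrated with a minimal sketch. It is not the authors' simulation or their SVM pipeline. It assumes that the effect size is an allelic odds ratio, that 20% is the risk-allele frequency in controls, that SNPs are independent, and that genotypes follow Hardy-Weinberg proportions. The function names, sample sizes, and the unweighted risk-allele count are hypothetical choices made for illustration.

```python
import random


def case_allele_frequency(control_freq, odds_ratio):
    """Risk-allele frequency in cases implied by an allelic odds ratio.

    Assumption: odds_ratio = [p_case / (1 - p_case)] / [p_ctrl / (1 - p_ctrl)].
    """
    case_odds = odds_ratio * control_freq / (1.0 - control_freq)
    return case_odds / (1.0 + case_odds)


def simulate_risk_scores(n_samples, n_snps, allele_freq, rng):
    """Total risk-allele count per simulated individual.

    Each SNP contributes 0, 1, or 2 risk alleles, drawn as two independent
    alleles (Hardy-Weinberg assumption); SNPs are assumed independent.
    """
    scores = []
    for _ in range(n_samples):
        count = 0
        for _ in range(n_snps):
            count += (rng.random() < allele_freq) + (rng.random() < allele_freq)
        scores.append(count)
    return scores


rng = random.Random(0)  # fixed seed so the sketch is reproducible
p_control = 0.20        # assumed risk-allele frequency in controls
p_case = case_allele_frequency(p_control, 1.5)

# Hypothetical study size: 200 controls and 200 cases, 100 SNPs each.
controls = simulate_risk_scores(200, 100, p_control, rng)
cases = simulate_risk_scores(200, 100, p_case, rng)

mean_control = sum(controls) / len(controls)
mean_case = sum(cases) / len(cases)
print(f"case allele frequency: {p_case:.4f}")
print(f"mean risk-allele count, controls: {mean_control:.1f}, cases: {mean_case:.1f}")
```

Under these assumptions the case allele frequency is 0.3 / 1.1, or about 0.27, so each SNP shifts the expected count only slightly. This is consistent with the abstract's point that many such SNPs are needed before cases and controls separate.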
Keywords :
biology computing; genomics; learning (artificial intelligence); pattern classification; support vector machines; SVM; disease classification; genetic mutation; genome-wide association studies; genomic profiling; machine learning; Bioinformatics; Diseases; Kernel; Vectors
Conference_Title :
Bioinformatics and Biomedicine Workshops (BIBMW), 2011 IEEE International Conference on
Conference_Location :
Atlanta, GA
Print_ISBN :
978-1-4577-1612-6
DOI :
10.1109/BIBMW.2011.6112449