Author/Authors :
Lim, Jayeon Department of Applied Statistics - Konkuk University - Seoul, Republic of Korea , Bang, So Youn Department of Data Science - Konkuk University - Seoul, Republic of Korea , Kim, Jiyeon Department of Statistics - Keimyung University - Daegu, Republic of Korea , Park, Cheolyong Department of Statistics - Keimyung University - Daegu, Republic of Korea , Cho, JunSang Industry-University Cooperation Foundation - Konkuk University - Seoul, Republic of Korea , Kim, SungHwan Department of Applied Statistics - Konkuk University - Seoul, Republic of Korea
Abstract :
As large amounts of genetic data accumulate, effective analytical methods and meaningful interpretation are required.
Recently, various methods of machine learning have emerged to process genetic data. In addition, machine learning analysis tools
using statistical models have been proposed. In this study, we propose adding an integrated layer to the deep learning structure,
which would enable the effective analysis of genetic data and the discovery of significant biomarkers of diseases. We conducted a
simulation study to compare the proposed method with meta-logistic regression and meta-SVM methods. An objective
function with a lasso penalty is used for parameter estimation, and the Youden J index is used for model comparison. The
simulation results indicate that the proposed method is more robust to the variance of the data than meta-logistic regression and
meta-SVM methods. We also analyzed real data, the TCGA breast cancer data set. Based on the results of gene set enrichment analysis, we found that the TCGA multiple omics data involve significantly enriched pathways that contain information related to breast cancer. The proposed method is therefore expected to be helpful for discovering biomarkers.
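For readers unfamiliar with the Youden J index mentioned above, the following is a minimal sketch of its standard definition, J = sensitivity + specificity - 1, for binary labels. The function name youden_j and the toy labels are hypothetical and chosen only for illustration; the abstract does not state how the authors computed the index or which classification threshold they used.

```python
def youden_j(y_true, y_pred):
    """Return Youden's J = sensitivity + specificity - 1.

    Assumes binary labels coded as 1 (case) and 0 (control).
    This is the textbook definition, not the authors' own code.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


# Toy example (made-up labels): 3 of 4 cases and 3 of 4 controls are correct.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
j_value = youden_j(y_true, y_pred)
print(j_value)
```

A value of 1 corresponds to perfect classification and a value near 0 to a classifier no better than chance, which is why the index is convenient for comparing models on the same data.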
Keywords :
Deep , Differentially , DE , Biomarkers