Title :
Classification of serous ovarian tumors based on microarray data using multicategory support vector machines
Author :
Jee Soo Park ; Soo Beom Choi ; Jai Won Chung ; Sung Woo Kim ; Deok Won Kim
Author_Institution :
Dept. of Med., Yonsei Univ., Seoul, South Korea
Abstract :
Ovarian cancer, the most fatal of the reproductive cancers, is the fifth leading cause of cancer death in women in the United States. Serous borderline ovarian tumors (SBOTs) are considered to be earlier or less malignant forms of serous ovarian carcinomas (SOCs). SBOTs are asymptomatic, and progression to advanced stages is common. Using DNA microarray data, we designed multicategory classification models to discriminate between ovarian cancer subclasses. To find models with optimal parameters and features, we systematically evaluated three machine learning algorithms and three feature selection methods using five-fold cross-validation and a grid search. The microarray data, with 54,675 probe sets, were obtained from the National Center for Biotechnology Information Gene Expression Omnibus repository and covered 22 subjects with normal ovarian surface epithelial cells, 12 with SBOTs, and 79 with SOCs. The optimal model, a one-versus-rest support vector machine with signal-to-noise ratio as the feature selection method, gave an accuracy of 97.3%, a relative classifier information of 0.916, and a kappa index of 0.941. In addition, 5 features, including the expression of the putative biomarkers SNTN and AOX1, were selected to differentiate between the normal, SBOT, and SOC groups. Accurate diagnosis of ovarian tumor subclasses by multicategory machine learning would be cost-effective and simple to perform, and would support more effective subclass-targeted therapy.
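As a rough illustration of two quantities named in the abstract, the sketch below computes a one-versus-rest signal-to-noise score per feature and a kappa index for a set of predictions. It is not the authors' implementation: the function names `snr_scores` and `cohen_kappa` are hypothetical, the signal-to-noise definition (difference of class means divided by the sum of class standard deviations, using population standard deviations) is an assumed common form, and the kappa index is assumed to be Cohen's kappa.

```python
from collections import Counter
from statistics import mean, pstdev


def snr_scores(X, y, positive):
    """One-versus-rest signal-to-noise score for each feature (assumed form).

    score_j = (mean_pos_j - mean_rest_j) / (sd_pos_j + sd_rest_j)
    X is a list of samples (each a list of feature values), y the class labels.
    """
    pos = [row for row, label in zip(X, y) if label == positive]
    rest = [row for row, label in zip(X, y) if label != positive]
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row in pos]
        b = [row[j] for row in rest]
        denom = pstdev(a) + pstdev(b)
        # A feature that is constant within both groups gets a score of 0.
        scores.append((mean(a) - mean(b)) / denom if denom else 0.0)
    return scores


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between labels, corrected for chance."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1:
        return 1.0  # only one class present in both label sets
    return (observed - expected) / (1 - expected)


# Toy data, invented for illustration only (not from the study).
X = [[1.0, 5.0], [2.0, 5.0], [9.0, 5.0], [10.0, 5.0]]
y = ["normal", "normal", "SOC", "SOC"]
scores = snr_scores(X, y, positive="SOC")
ranked = sorted(range(len(scores)), key=lambda j: abs(scores[j]), reverse=True)
print("features ranked by |score|:", ranked)
print("kappa:", cohen_kappa(["a", "a", "b", "b"], ["a", "a", "b", "a"]))
```

In a pipeline like the one described, features would be ranked by the absolute score for each class against the rest, and the top-ranked features passed to the classifier; how many to keep would be one of the parameters tuned by cross-validation.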
Keywords :
DNA; bioMEMS; biological organs; cancer; feature selection; lab-on-a-chip; learning (artificial intelligence); medical computing; patient diagnosis; support vector machines; DNA microarray technology; SBOT; SOC; biomarker expression; feature selection method; five-fold cross validation; grid search; kappa index; machine learning algorithm; microarray data; multicategory classification model; multicategory support vector machines; normal ovarian surface epithelial cells; ovarian cancer; reproductive cancer; serous borderline ovarian tumors; serous ovarian carcinomas; serous ovarian tumor classification;
Conference_Title :
Engineering in Medicine and Biology Society (EMBC), 2014 36th Annual International Conference of the IEEE
Conference_Location :
Chicago, IL
DOI :
10.1109/EMBC.2014.6944360