DocumentCode :
140151
Title :
Multicategory classification of 11 neuromuscular diseases based on microarray data using support vector machine
Author :
Soo Beom Choi ; Jee Soo Park ; Jai Won Chung ; Tae Keun Yoo ; Deok Won Kim
Author_Institution :
Brain 21 PLUS Project for Med. Sci., Yonsei Univ., Seoul, South Korea
fYear :
2014
fDate :
26-30 Aug. 2014
Firstpage :
3460
Lastpage :
3463
Abstract :
We applied multicategory machine learning methods to classify 11 neuromuscular disease groups and one control group based on microarray data. To develop multicategory classification models with optimal parameters and features, we systematically evaluated three machine learning algorithms and four feature selection methods using three-fold cross-validation and a grid search. The study included 114 subjects with 11 neuromuscular diseases and 31 control subjects, using microarray data with 22,283 probe sets from the National Center for Biotechnology Information (NCBI). We obtained an accuracy of 100%, a relative classifier information (RCI) of 1.0, and a kappa index of 1.0 with three models: one-versus-one support vector machine (SVM-OVO), one-versus-rest SVM (SVM-OVR), and directed acyclic graph SVM (DAGSVM), each combined with the feature selection method based on the ratio of between-category to within-category sums of squares (BW). Each of these three models selected only four features to categorize the 12 groups, offering a time-saving and cost-effective strategy for diagnosing neuromuscular diseases. In addition, the gene SPP1 was selected as the top-ranked gene by the BW method, and a previous study confirms a relationship between SPP1 and Duchenne muscular dystrophy (DMD). Used as clinical support tools, our models could classify neuromuscular diseases quickly by computer, yielding a time-saving, cost-effective, and accurate diagnosis.
Keywords :
diseases; feature selection; genetics; genomics; learning (artificial intelligence); medical diagnostic computing; medical disorders; muscle; neurophysiology; patient diagnosis; support vector machines; BW method; DAGSVM; DMD; Duchenne muscular dystrophy; NCBI; National Center for Biotechnology Information; OVR; RCI; SPP1; SVM one-versus-rest; SVM-OVO; directed acyclic graph SVM; feature selection methods; gene ratio; gene symbol; grid search; kappa index; machine learning algorithms; microarray data; multicategory classification models; multicategory machine learning methods; neuromuscular disease diagnosis; neuromuscular disease groups; optimal features; optimal parameters; probe sets; relative classifier information; support vector machines one-versus-one; systematic evaluation; three-fold cross validation; top-ranked gene; within-category sums of squares feature selection method;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Engineering in Medicine and Biology Society (EMBC), 2014 36th Annual International Conference of the IEEE
Conference_Location :
Chicago, IL
ISSN :
1557-170X
Type :
conf
DOI :
10.1109/EMBC.2014.6944367
Filename :
6944367