• DocumentCode
    1784986
  • Title
    Meta-classification model for diabetes onset forecast: A proof of concept
  • Author
    Nnamoko, N.A. ; Arshad, F.N. ; England, D. ; Vora, Jiten
  • Author_Institution
    Sch. of Comput. & Math. Sci., Liverpool John Moores Univ., Liverpool, UK
  • fYear
    2014
  • fDate
    2-5 Nov. 2014
  • Firstpage
    50
  • Lastpage
    56
  • Abstract
    We propose a robust diabetes prediction model by examining how predictions from several learning algorithms performing the same task can be combined to outperform the best individual learning algorithm. The task was to forecast the onset of non-insulin-dependent diabetes within a five-year period using information from previous vital sign examinations. The experimental data form a 768 × 9 array in which each row is one observation: the first eight columns hold the observed inputs and the last column holds the output. Five well-known models were trained with their associated learning algorithms (Sequential Minimal Optimization (SMO), Radial Basis Function (RBF), C4.5, Naïve Bayes and RIPPER) on the same dataset, and their performance was compared using Accuracy, Receiver Operating Characteristics area (aROC) and Speed as metrics. After comparison, a combiner (Meta) model using a simple Logistic Regression algorithm was trained to make a final prediction, taking the outputs of the best and worst performing algorithms (ranked in the order Accuracy - aROC - Speed) as additional inputs. C4.5 had the best performance, with an Accuracy of 77.9% and an aROC of 83.1%. RBF had the lowest performance, with an Accuracy of 73.6% and an aROC of 80.5%. The Meta model achieved a classification Accuracy of 77.0% with an aROC of 84.9%. The slight decline in Accuracy relative to C4.5 arose because aROC, not Accuracy, was used as the performance metric during selection.
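    The combiner idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses synthetic data rather than the study's dataset, two one-feature threshold rules stand in for the C4.5 and RBF models, the logistic regression is hand-written, and every function name is hypothetical.

```python
import math
import random


def make_data(n, seed):
    # Synthetic stand-in data (assumption): two noisy features, binary label.
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        centre = 1.0 if y == 1 else -1.0
        data.append(([rng.gauss(centre, 1.5), rng.gauss(centre, 1.5)], y))
    return data


def fit_stump(data, feature):
    # Base learner: threshold one feature at the midpoint of the class means.
    pos = [x[feature] for x, y in data if y == 1]
    neg = [x[feature] for x, y in data if y == 0]
    mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    threshold = (mean_pos + mean_neg) / 2.0
    sign = 1.0 if mean_pos >= mean_neg else -1.0
    return lambda x: 1.0 if sign * (x[feature] - threshold) > 0 else 0.0


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, epochs=300, lr=0.1):
    # Combiner: logistic regression fitted by batch gradient descent.
    w = [0.0] * (len(rows[0]) + 1)  # last entry is the bias
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + w[-1]) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
            grad[-1] += err
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad)]
    return lambda x: sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + w[-1])


def accuracy(scores, labels):
    # Fraction of cases where thresholding the score at 0.5 matches the label.
    return sum((s >= 0.5) == (y == 1) for s, y in zip(scores, labels)) / len(labels)


def area_under_roc(scores, labels):
    # Probability that a random positive outranks a random negative (ties count half).
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def run_demo(seed=0):
    data = make_data(300, seed)
    # Separate splits so the combiner is not fitted on the base models' training rows.
    base_train, meta_train, test = data[:100], data[100:200], data[200:]
    base_models = [fit_stump(base_train, 0), fit_stump(base_train, 1)]

    def augment(x):
        # Original inputs plus each base model's prediction as additional inputs.
        return x + [m(x) for m in base_models]

    meta = fit_logistic([augment(x) for x, _ in meta_train],
                        [y for _, y in meta_train])
    scores = [meta(augment(x)) for x, _ in test]
    labels = [y for _, y in test]
    return accuracy(scores, labels), area_under_roc(scores, labels)


acc, auc = run_demo()
print(f"accuracy={acc:.3f} aROC={auc:.3f}")
```

    Holding out a separate split for the combiner is a design choice of this sketch; the abstract does not state how the Meta model's training inputs were generated.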
  • Keywords
    Bayes methods; classification; decision trees; diseases; feature extraction; knowledge based systems; learning (artificial intelligence); medical diagnostic computing; meta data; optimisation; patient diagnosis; radial basis function networks; regression analysis; sensitivity analysis; C4.5 method; RBF method; RIPPER method; SMO method; aROC metric; accuracy metric; classification accuracy; classifier selection; combiner model; diabetes onset forecast; experimental data array; learning algorithm performance; logistic regression algorithm; meta-classification model; metamodel; model training; naive Bayes method; noninsulin dependent diabetes; performance metric; radial basis function; receiver operating characteristics area metric; robust diabetes prediction model; row vector; sequential minimal optimization method; single output vector; speed metric; time 5 year; vital sign examination information; Accuracy; Classification algorithms; Decision trees; Diabetes; Prediction algorithms; Predictive models; Training; Decision Tree; Diabetes; Naïve Bayes; Neural Network; Rule-based classifier; Support Vector Machine;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics and Biomedicine (BIBM), 2014 IEEE International Conference on
  • Conference_Location
    Belfast
  • Type
    conf
  • DOI
    10.1109/BIBM.2014.6999247
  • Filename
    6999247