• DocumentCode
    2138205
  • Title

    Diabetes Data Analysis and Prediction Model Discovery Using RapidMiner

  • Author

    Han, Jianchao ; Rodriguez, Juan Carlos ; Beheshti, Mohsen

  • Author_Institution
    Dept. of Comput. Sci., California State Univ. Dominguez Hills, CA, USA
  • Volume
    3
  • fYear
    2008
  • fDate
    13-15 Dec. 2008
  • Firstpage
    96
  • Lastpage
    99
  • Abstract
    Data mining techniques have been extensively applied in bioinformatics to analyze biomedical data. In this paper, we use Rapid-I's RapidMiner to analyze the Pima Indians Diabetes Data Set, which records information on patients who did and did not develop diabetes. The discussion follows the data mining process. The focus is on data preprocessing, including attribute identification and selection, outlier removal, data normalization and numerical discretization, visual data analysis, hidden relationship discovery, and diabetes prediction model construction.
  • Keywords
    biochemistry; bioinformatics; data analysis; data mining; diseases; medical information systems; Pima Indians Diabetes Data Set; RapidMiner; bioinformatics; biomedical data analysis; data mining techniques; data normalization; data preprocessing; diabetes data analysis; diabetes prediction model construction; diabetes prediction model discovery; numerical discretization; patient information; visual data analysis; Bioinformatics; Data analysis; Data mining; Data preprocessing; Diabetes; Humans; Java; Open source software; Predictive models; Pregnancy; Data mining; decision tree; diabetes data; prediction modeling;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Future Generation Communication and Networking, 2008. FGCN '08. Second International Conference on
  • Conference_Location
    Hainan Island
  • Print_ISBN
    978-0-7695-3431-2
  • Type

    conf

  • DOI
    10.1109/FGCN.2008.226
  • Filename
    4734287