DocumentCode
2138205
Title
Diabetes Data Analysis and Prediction Model Discovery Using RapidMiner
Author
Han, Jianchao ; Rodriguez, Juan Carlos ; Beheshti, Mohsen
Author_Institution
Dept. of Comput. Sci., California State Univ. Dominguez Hills, CA, USA
Volume
3
fYear
2008
fDate
13-15 Dec. 2008
Firstpage
96
Lastpage
99
Abstract
Data mining techniques have been extensively applied in bioinformatics to analyze biomedical data. In this paper, we choose Rapid-I's RapidMiner as our tool to analyze the Pima Indians Diabetes Data Set, which collects information on patients who did and did not develop diabetes. The discussion follows the data mining process. The focus is on data preprocessing, including attribute identification and selection, outlier removal, data normalization and numerical discretization, visual data analysis, hidden relationship discovery, and diabetes prediction model construction.
Keywords
biochemistry; bioinformatics; data analysis; data mining; diseases; medical information systems; Pima Indians Diabetes Data Set; RapidMiner; bioinformatics; biomedical data analysis; data mining techniques; data normalization; data preprocessing; diabetes data analysis; diabetes prediction model construction; diabetes prediction model discovery; numerical discretization; patient information; visual data analysis; Bioinformatics; Data analysis; Data mining; Data preprocessing; Diabetes; Humans; Java; Open source software; Predictive models; Pregnancy; Data mining; decision tree; diabetes data; prediction modeling;
fLanguage
English
Publisher
IEEE
Conference_Titel
Future Generation Communication and Networking, 2008. FGCN '08. Second International Conference on
Conference_Location
Hainan Island
Print_ISBN
978-0-7695-3431-2
Type
conf
DOI
10.1109/FGCN.2008.226
Filename
4734287