DocumentCode :
268647
Title :
Extending Association Rule Summarization Techniques to Assess Risk of Diabetes Mellitus
Author :
Simon, György J. ; Caraballo, Pedro J. ; Therneau, Terry M. ; Cha, Steven S. ; Castro, M. Regina ; Li, Peter W.
Author_Institution :
Inst. for Health Inf., Univ. of Minnesota, Minneapolis, MN, USA
Volume :
27
Issue :
1
fYear :
2015
fDate :
Jan. 2015
Firstpage :
130
Lastpage :
141
Abstract :
Early detection of patients with elevated risk of developing diabetes mellitus is critical to the improved prevention and overall clinical management of these patients. We aim to apply association rule mining to electronic medical records (EMR) to discover sets of risk factors and their corresponding subpopulations that represent patients at particularly high risk of developing diabetes. Given the high dimensionality of EMRs, association rule mining generates a very large set of rules which we need to summarize for easy clinical use. We reviewed four association rule set summarization techniques and conducted a comparative evaluation to provide guidance regarding their applicability, strengths and weaknesses. We proposed extensions to incorporate risk of diabetes into the process of finding an optimal summary. We evaluated these modified techniques on a real-world prediabetic patient cohort. We found that all four methods produced summaries that described subpopulations at high risk of diabetes, with each method having its own clear strength. For our purpose, our extension to the Bottom-Up Summarization (BUS) algorithm produced the most suitable summary. The subpopulations identified by this summary covered most high-risk patients, had low overlap and were at very high risk of diabetes.
Keywords :
data mining; electronic health records; patient diagnosis; risk management; BUS algorithm; EMR; association rule mining; association rule summarization techniques; bottom-up summarization; diabetes mellitus risk assessment; electronic medical records; high-risk patients; prediabetic patient cohort; Clustering; Data mining; Database Applications; Database Management; Information Technology and Systems; Mathematics of Computing; Probability and Statistics; Statistical computing; Survival analysis; association rule summarization; association rules; classification
fLanguage :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2013.76
Filename :
6514877