DocumentCode :
3806786
Title :
Sequential Data Mining: A Comparative Case Study in Development of Atherosclerosis Risk Factors
Author :
Jiri Klema;Lenka Novakova;Filip Karel;Olga Stepankova;Filip Zelezny
Author_Institution :
Czech Tech. Univ. in Prague, Prague
Volume :
38
Issue :
1
fYear :
2008
Firstpage :
3
Lastpage :
15
Abstract :
Sequential data represent an important source of potentially new medical knowledge. However, this type of data is rarely provided in a format suitable for immediate application of conventional mining algorithms. This paper summarizes and compares three different sequential mining approaches based, respectively, on windowing, episode rules, and inductive logic programming. Windowing is one of the essential methods of data preprocessing. Episode rules represent general sequential mining, while inductive logic programming extracts first-order features whose structure is determined by background knowledge. The three approaches are demonstrated and evaluated on the STULONG case study, a longitudinal preventive study of atherosclerosis whose data consist of a series of long-term observations recording the development of risk factors and associated conditions. The intention is to identify frequent sequential/temporal patterns. Possible relations between the patterns and the onset of any of the observed cardiovascular diseases are also studied.
Keywords :
"Data mining","Atherosclerosis","Logic programming","Data preprocessing","Feature extraction","Cardiovascular diseases","Pattern analysis","Databases","Educational programs","Biomedical engineering"
Journal_Title :
IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews)
Publisher :
IEEE
ISSN :
1094-6977
Type :
jour
DOI :
10.1109/TSMCC.2007.906055
Filename :
4383142