DocumentCode
2131023
Title
Genetic Algorithm and Data Mining Techniques for Design Selection in Databases
Author
Koukouvinos, Christos ; Parpoula, Christina ; Simos, Dimitris E.
Author_Institution
Dept. of Math., Nat. Tech. Univ. of Athens, Athens, Greece
fYear
2013
fDate
2-6 Sept. 2013
Firstpage
743
Lastpage
746
Abstract
Variable selection is fundamental to large dimensional statistical modelling problems, since large databases now exist in diverse fields of science. In this paper, we use data mining tools and experimental designs in databases to select the most relevant variables for classification in regression problems, in cases where the observations and labels of a real-world dataset are available. Specifically, we use health data to identify the most significant variables, those that carry the information needed to classify and predict new data with respect to a certain effect (survival or death). The main goal is to determine the most important variables using methods from the field of design of experiments combined with algorithmic concepts derived from data mining and metaheuristics. Our approach seems promising, since we are able to retrieve an optimal plan using only 6 of the 8862 available runs.
Keywords
data mining; design of experiments; genetic algorithms; health care; medical information systems; pattern classification; regression analysis; support vector machines; very large databases; association rule mining; data classification; data mining techniques; data prediction; design selection; design-of-experiments; genetic algorithm; health data; large databases; large dimensional statistical modelling problems; metaheuristic algorithms; regression problems; variable selection; Algorithm design and analysis; Association rules; Databases; Input variables; feature selection; large dimensional data; metaheuristics; sensitivity analysis
fLanguage
English
Publisher
IEEE
Conference_Titel
Availability, Reliability and Security (ARES), 2013 Eighth International Conference on
Conference_Location
Regensburg
Type
conf
DOI
10.1109/ARES.2013.98
Filename
6657314