DocumentCode
2429950
Title
Development of an extended robust data mining (ERDM) model
Author
Yang, Le ; Shin, Sangmun ; Choi, Yongsun ; Park, Kyungjin ; Kaewkuekool, Sittichai ; Chantrasa, Ruephuwan ; Lila, Banhan
Author_Institution
Inje Univ., Kimhae
fYear
2007
fDate
17-20 Oct. 2007
Firstpage
1523
Lastpage
1528
Abstract
Most data mining (DM) methods for factor selection reviewed in the literature identify a number of input factors associated with the response of interest but do not provide detailed information such as the relationship between the input factors and the response, statistical inferences, or analysis. These DM methods also may not address the robustness of solutions, either by preprocessing the data for outliers and missing values or by considering uncontrollable noise factors. To address these problems, we propose an extended robust data mining (ERDM) model with three main stages. First, the proposed ERDM applies an outlier test and the expectation-maximization (EM) algorithm to preprocess the data. Second, it reduces dimensionality, finding the significant factors among a large number of input factors using the correlation-based feature selection (CBFS) method and the best-first search (BFS) algorithm. Finally, it applies robust design theory to handle the noise factors, using the concept of a surrogate variable and response surface methodology (RSM).
Keywords
correlation methods; data mining; data reduction; expectation-maximisation algorithm; feature extraction; response surface methodology; statistical testing; tree searching; best first search algorithm; correlation-based feature selection method; data dimensionality reduction; expectation maximum algorithm; extended robust data mining model; noise factor selection; outlier test; response surface methodology; statistical inference; surrogate variable concept; Data analysis; Data engineering; Data mining; Delta modulation; Educational technology; Electronic mail; Filters; Information analysis; Noise robustness; Testing; Correlation-Based Feature Selection (CBFS); Data Mining (DM); Expectation Maximization (EM) Algorithm; Response Surface Methodology (RSM); Robust Design (RD);
fLanguage
English
Publisher
IEEE
Conference_Titel
Control, Automation and Systems, 2007. ICCAS '07. International Conference on
Conference_Location
Seoul
Print_ISBN
978-89-950038-6-2
Electronic_ISBN
978-89-950038-6-2
Type
conf
DOI
10.1109/ICCAS.2007.4406581
Filename
4406581