• DocumentCode
    2429950
  • Title
    Development of an extended robust data mining (ERDM) model
  • Author
    Yang, Le ; Shin, Sangmun ; Choi, Yongsun ; Park, Kyungjin ; Kaewkuekool, Sittichai ; Chantrasa, Ruephuwan ; Lila, Banhan
  • Author_Institution
    Inje Univ., Kimhae
  • fYear
    2007
  • fDate
    17-20 Oct. 2007
  • Firstpage
    1523
  • Lastpage
    1528
  • Abstract
    Most data mining (DM) methods for factor selection reviewed in the literature may identify a set of input factors associated with the response of interest without providing detailed information such as the relationship between the input factors and the response, statistical inferences, or analysis. These DM methods also may not address the robustness of solutions, either through data preprocessing for outliers and missing values or through consideration of uncontrollable noise factors. To address these problems, we propose an extended robust data mining (ERDM) model. The model has three main components. First, the proposed ERDM applies an outlier test and the expectation-maximization (EM) algorithm to preprocess the data. Second, it reduces the dimensionality, finding the significant factors among a large number of input factors by using the correlation-based feature selection (CBFS) method with a best-first search (BFS) algorithm. Finally, the proposed model uses robust design theory to handle the noise factors through the concept of surrogate variables and response surface methodology (RSM).
  • Keywords
    correlation methods; data mining; data reduction; expectation-maximisation algorithm; feature extraction; response surface methodology; statistical testing; tree searching; best first search algorithm; correlation-based feature selection method; data dimensionality reduction; expectation maximum algorithm; extended robust data mining model; noise factor selection; outlier test; response surface methodology; statistical inference; surrogate variable concept; Data analysis; Data engineering; Data mining; Delta modulation; Educational technology; Electronic mail; Filters; Information analysis; Noise robustness; Testing; Correlation-Based Feature Selection (CBFS); Data Mining (DM); Expectation Maximization (EM) Algorithm; Response Surface Methodology (RSM); Robust Design (RD);
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Control, Automation and Systems, 2007. ICCAS '07. International Conference on
  • Conference_Location
    Seoul
  • Print_ISBN
    978-89-950038-6-2
  • Electronic_ISBN
    978-89-950038-6-2
  • Type
    conf
  • DOI
    10.1109/ICCAS.2007.4406581
  • Filename
    4406581
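
The dimensionality-reduction step named in the abstract (CBFS with best-first search) can be illustrated roughly as follows. This is a minimal sketch, not the authors' implementation: the function names `cfs_merit` and `cbfs_best_first` are hypothetical, absolute Pearson correlation is assumed as the correlation measure, the merit formula is the commonly cited one for correlation-based feature selection, and the stopping rule (five consecutive non-improving expansions) is an assumption. It also assumes no input column is constant.

```python
import heapq

import numpy as np


def cfs_merit(corr_fy, corr_ff, subset):
    """Merit of a feature subset: k*mean(r_cf) / sqrt(k + k*(k-1)*mean(r_ff))."""
    k = len(subset)
    if k == 0:
        return 0.0
    idx = sorted(subset)
    r_cf = corr_fy[idx].mean()  # mean feature-response correlation
    if k == 1:
        r_ff = 0.0
    else:
        block = corr_ff[np.ix_(idx, idx)]
        # Mean of the off-diagonal entries (the diagonal is all ones).
        r_ff = (block.sum() - k) / (k * (k - 1))
    return float(k * r_cf / np.sqrt(k + k * (k - 1) * r_ff))


def cbfs_best_first(X, y, max_stale=5):
    """Forward best-first search over feature subsets, scored by cfs_merit.

    max_stale is an assumed stopping rule: the search ends after this many
    consecutive expansions that fail to improve the best merit found.
    """
    n_features = X.shape[1]
    # Absolute Pearson correlations among all features and the response.
    corr = np.abs(np.corrcoef(np.column_stack([X, y]), rowvar=False))
    corr_ff, corr_fy = corr[:-1, :-1], corr[:-1, -1]

    frontier = [(0.0, [])]  # min-heap of (-merit, sorted feature indices)
    visited = {frozenset()}
    best_subset, best_merit = frozenset(), 0.0
    stale = 0
    while frontier and stale < max_stale:
        _, current = heapq.heappop(frontier)  # most promising open subset
        improved = False
        for j in range(n_features):
            child = frozenset(current) | {j}
            if child in visited:  # also skips j already in the subset
                continue
            visited.add(child)
            merit = cfs_merit(corr_fy, corr_ff, child)
            heapq.heappush(frontier, (-merit, sorted(child)))
            if merit > best_merit + 1e-12:
                best_subset, best_merit = child, merit
                improved = True
        stale = 0 if improved else stale + 1
    return sorted(best_subset), best_merit
```

Calling `cbfs_best_first(X, y)` on a numeric matrix `X` (rows are observations, columns are input factors) and a response vector `y` returns the selected column indices and their merit. The merit rewards subsets whose factors correlate strongly with the response and weakly with each other, which is why redundant or irrelevant factors tend to be left out.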