• DocumentCode
    253232
  • Title

    A new approach for data cleaning process

  • Author

    Krishnamoorthy, R. ; Kumar, Sahoo Subhendu ; Neelagund, Basavaraj

  • Author_Institution
    Dept. of CSE, Anna Univ., Chennai, India
  • fYear
    2014
  • fDate
    9-11 May 2014
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    In this paper, a new approach called Effective Data Cleaning (EDC) is presented. The proposed EDC technique aims to identify the relevant and irrelevant instances in a large data set through the degree of missing values of each instance, and it reconstructs the missing values in a relevant instance from its closest instance within the instance set. The EDC technique consists of two methods: Identify Relevant Instance (IRI) and Reconstruct Missing Value (RMV). The IRI method identifies the relevant and irrelevant instances in the large instance set through the degree of missing values of each instance, and the RMV method reconstructs the missing values in a relevant instance from its closest instance based on a distance metric. Experimental results show that the proposed EDC technique is simple and effective for identifying relevant and irrelevant instances and for reconstructing the missing values in relevant instances from the closest instance with higher similarity.
  • Keywords
    data handling; EDC technique; IRI method; RMV method; closest instance; distance metric; effective data cleaning process; identify relevant instance; instance set; irrelevant instance; reconstruct missing value; Data mining; Measurement; Effective Data Cleaning (EDC); Identify Relevant Instance (IRI) and Reconstruct Missing Value (RMV);
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Recent Advances and Innovations in Engineering (ICRAIE), 2014
  • Conference_Location
    Jaipur
  • Print_ISBN
    978-1-4799-4041-7
  • Type

    conf

  • DOI
    10.1109/ICRAIE.2014.6909249
  • Filename
    6909249
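
The two steps named in the abstract (IRI, then RMV) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the threshold, the distance metric, or the data layout. The sketch assumes numeric attributes, `None` as the missing-value marker, a hypothetical cutoff `max_degree` on the fraction of missing attributes, and a Euclidean distance over the attributes both instances have observed. All function names are illustrative.

```python
import math


def missing_degree(instance):
    """Fraction of attributes that are missing (None) in one instance."""
    return sum(v is None for v in instance) / len(instance)


def identify_relevant(instances, max_degree=0.5):
    """IRI-style split. max_degree is an assumed cutoff, not from the paper."""
    relevant, irrelevant = [], []
    for inst in instances:
        if missing_degree(inst) <= max_degree:
            relevant.append(inst)
        else:
            irrelevant.append(inst)
    return relevant, irrelevant


def distance(a, b):
    """Assumed metric: root mean squared difference over shared observed attributes."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf  # nothing to compare on
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))


def reconstruct_missing(relevant):
    """RMV-style fill: copy each missing value from the closest donor instance."""
    repaired = []
    for i, inst in enumerate(relevant):
        filled = list(inst)
        for j, value in enumerate(inst):
            if value is not None:
                continue
            # Donors are other relevant instances that observe attribute j.
            donors = [o for k, o in enumerate(relevant) if k != i and o[j] is not None]
            if donors:
                closest = min(donors, key=lambda o: distance(inst, o))
                filled[j] = closest[j]
        repaired.append(filled)
    return repaired


data = [
    [1.0, 2.0, 3.0],
    [1.1, None, 3.1],
    [9.0, 8.0, 7.0],
    [None, None, 5.0],
]
relevant, irrelevant = identify_relevant(data)
cleaned = reconstruct_missing(relevant)
```

With the toy data above, the last instance has two of three attributes missing and is set aside as irrelevant, while the second instance takes its missing value from the first, which is its nearest neighbour on the observed attributes. A missing value stays `None` when no other relevant instance observes that attribute.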