• DocumentCode
    2835805
  • Title
    Error Detection and Uncertainty Modeling for Imprecise Data
  • Author
    He, Dan ; Zhu, Xingquan ; Wu, Xindong
  • Author_Institution
    Dept. of Comput. Sci., Univ. of California Los Angeles, Los Angeles, CA, USA
  • fYear
    2009
  • fDate
    2-4 Nov. 2009
  • Firstpage
    792
  • Lastpage
    795
  • Abstract
    In this paper, we propose a method to derive and model data uncertainty from imprecise data. We view data imprecision and errors as the outcome of precise data being exposed to uncertain channels, and our scheme derives the data uncertainty model directly from the imprecise data, so that the derived uncertainty information can be integrated into the subsequent mining process. To achieve this goal, we propose an expectation maximization (EM) based approach to detect erroneous data entries in the input data. The data uncertainty models are constructed by applying statistical analysis to the detected errors. Experimental results show that the proposed error detection approach can locate data errors and suggest alternative data entry values to improve classifiers built from imprecise data. In addition, the uncertainty models derived for each individual attribute are shown to be close to the genuine uncertainty models used to corrupt the data.
  • Keywords
    data mining; expectation-maximisation algorithm; statistical analysis; uncertainty handling; data classification; data mining process; data uncertainty model; expectation maximization based approach; imprecise data error detection approach; uncertainty modeling; Artificial intelligence; Australia; Computer errors; Computer science; Data mining; Helium; Predictive models; Statistical analysis; USA Councils; Uncertainty
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Tools with Artificial Intelligence, 2009. ICTAI '09. 21st International Conference on
  • Conference_Location
    Newark, NJ
  • ISSN
    1082-3409
  • Print_ISBN
    978-1-4244-5619-2
  • Electronic_ISBN
    1082-3409
  • Type
    conf
  • DOI
    10.1109/ICTAI.2009.9
  • Filename
    5364419