• DocumentCode
    3201421
  • Title

    Missing data estimation on heart disease using Artificial Neural Network and Rough Set Theory

  • Author

    Setiawan, N.A. ; Venkatachalam, P.A. ; Hani, A.F.M.

  • Author_Institution
    Electr. & Electron. Eng. Programme, Univ. Teknol. Petronas, Tronoh
  • fYear
    2007
  • fDate
    25-28 Nov. 2007
  • Firstpage
    129
  • Lastpage
    133
  • Abstract
    The objective of this research is to implement a method for estimating real missing data in heart disease datasets and to show how the estimation affects the resulting knowledge. Missing data is a common problem in knowledge discovery from databases (KDD) processes and can lead to significant errors in the extracted knowledge. We use a hybridization of artificial neural network and rough set theory (ANNRST) to estimate the real missing data in the heart disease datasets from UCI (University of California, Irvine). An ANN with reduced input features is used to estimate the missing data. RST is used to reduce the dimensionality of the input features and to extract knowledge, as reducts and rules, from the heart disease datasets with estimated missing data. RST, decomposition tree, local transfer function classifier (LTF-C) and k-nearest neighbor (k-NN) classifiers are used to calculate the accuracy. A comparative study with k-NN estimation, most common attribute value filling and deletion of missing data is made to evaluate the extracted knowledge. ANNRST can be considered the appropriate estimation method when a strong relationship between the original complete datasets and the estimated datasets is important (that is, when the estimated datasets must really represent the nature of the original complete datasets), as it gives the best accuracy and coverage for almost all the classifiers.
  • Keywords
    cardiology; data mining; database management systems; neural nets; pattern classification; rough set theory; artificial neural network; database; decomposition tree; heart disease; k-nearest neighbor classifier; knowledge discovery; knowledge extraction; local transfer function classifier; missing data estimation; rough set theory; Accuracy; Artificial neural networks; Cardiac disease; Classification tree analysis; Data mining; Estimation theory; Feature extraction; Set theory; Spatial databases; Transfer functions; missing value; neural network; rough set theory;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent and Advanced Systems, 2007. ICIAS 2007. International Conference on
  • Conference_Location
    Kuala Lumpur
  • Print_ISBN
    978-1-4244-1355-3
  • Electronic_ISBN
    978-1-4244-1356-0
  • Type

    conf

  • DOI
    10.1109/ICIAS.2007.4658361
  • Filename
    4658361
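
The abstract compares ANNRST against three simpler ways of handling missing data: deletion, most common attribute value filling, and k-NN estimation. The sketch below illustrates only those three baselines, not ANNRST itself, and is not taken from the paper. It assumes that missing cells are encoded as `None`, that attributes are integer-coded, that k-NN uses Euclidean distance over the known attributes, and that a missing cell is filled by majority vote among the neighbors. The function names and the toy `rows` table are hypothetical.

```python
from collections import Counter
import math

MISSING = None  # assumption: a missing cell is encoded as None


def delete_incomplete(rows):
    """Deletion baseline: keep only rows that have no missing cell."""
    return [list(r) for r in rows if MISSING not in r]


def fill_most_common(rows):
    """Fill each missing cell with the most frequent observed value of its column."""
    n_cols = len(rows[0])
    modes = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not MISSING]
        modes.append(Counter(observed).most_common(1)[0][0])
    return [[modes[j] if v is MISSING else v for j, v in enumerate(r)]
            for r in rows]


def fill_knn(rows, k=3):
    """Fill each missing cell by majority vote among the k nearest complete rows.

    Distance is Euclidean, computed only over the attributes that are
    known in the incomplete row (an assumption of this sketch).
    """
    complete = delete_incomplete(rows)
    filled = []
    for r in rows:
        if MISSING not in r:
            filled.append(list(r))
            continue
        known = [j for j, v in enumerate(r) if v is not MISSING]

        def dist(c, r=r, known=known):
            return math.sqrt(sum((r[j] - c[j]) ** 2 for j in known))

        nearest = sorted(complete, key=dist)[:k]
        filled.append([
            Counter(c[j] for c in nearest).most_common(1)[0][0]
            if v is MISSING else v
            for j, v in enumerate(r)
        ])
    return filled


# Toy, made-up table (not the UCI heart disease data): 3 integer-coded attributes.
rows = [
    [1, 0, 1],
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
    [0, None, 0],
    [1, 0, None],
]

if __name__ == "__main__":
    print(delete_incomplete(rows))
    print(fill_most_common(rows))
    print(fill_knn(rows, k=1))
```

On this toy table the two filling strategies disagree on the fifth row: most common value filling uses the column-wide mode, while k-NN with `k=1` copies the value from the single most similar complete row. This is the kind of difference that a comparative study of the extracted knowledge would be sensitive to.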