• DocumentCode
    3724997
  • Title

    The impact of different fold for cross validation of missing values imputation method on hepatitis dataset

  • Author

    Tri Astuti;Hanung Adi Nugroho;Teguh Bharata Adji

  • Author_Institution
    Department of Informatics Engineering, STMIK Amikom Purwokerto, Indonesia
  • fYear
    2015
  • Firstpage
    51
  • Lastpage
    55
  • Abstract
    Hepatitis is a liver disease caused by hepatitis viruses. Hepatitis is now a global health problem, including in Indonesia. Chronic hepatitis can lead to cirrhosis and liver cancer; therefore, early diagnosis is needed. Several research works on the development of computer-aided systems have been conducted to improve the diagnosis of hepatitis. The University of California, Irvine (UCI) machine-learning repository provides a publicly accessible hepatitis dataset; however, the dataset contains many missing values. The existence of missing values in the dataset may affect the quality of the analysis results. Therefore, the missing values need to be handled. This paper analyses the effect of varying the number of folds in the cross validation of missing-value imputation methods. The imputation method is combined with a feature selection method and a machine-learning algorithm on the hepatitis dataset. The results show that varying the number of folds in the k-fold cross validation applied in the imputation method does not reveal significant advantages.
  • Keywords
    "Viruses (medical)","Computational modeling"
  • Publisher
    ieee
  • Conference_Titel
    Quality in Research (QiR), 2015 International Conference on
  • Print_ISBN
    978-1-4799-6550-2
  • Type

    conf

  • DOI
    10.1109/QiR.2015.7374894
  • Filename
    7374894