• DocumentCode
    3562645
  • Title
    Comparison of K-Means clustering and statistical outliers in reducing medical datasets
  • Author
    Santhanam, T. ; Padmavathi, M.S.

  • Author_Institution
    Dept. of Comput. Sci., D.G. Vaishnav Coll., Chennai, India
  • fYear
    2014
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    Data reduction is the process of reducing the volume of a dataset and is used in almost all real-time applications. Although several techniques are available, many researchers have used K-Means clustering to reduce datasets. In this paper, three different methods were used to replace missing values: the mean, the median, and a predicted score. The cleaned datasets were then reduced using K-Means clustering and statistical outlier detection. This work compares the data reduction percentage achieved by K-Means and by statistical outlier detection for all three imputation methods. The experimental results indicate that the reduction rate of statistical outlier detection is lower than that of K-Means clustering.
  • Keywords
    data reduction; medical administrative data processing; pattern clustering; statistical analysis; K-means clustering; medical dataset reduction; statistical outlier detection; Cleaning; Clustering algorithms; Computer science; Data mining; Data models; Diabetes; Medical diagnostic imaging; Data Reduction; K-Means clustering; Missing values; Outliers;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Science Engineering and Management Research (ICSEMR), 2014 International Conference on
  • Print_ISBN
    978-1-4799-7614-0
  • Type
    conf
  • DOI
    10.1109/ICSEMR.2014.7043602
  • Filename
    7043602