• DocumentCode
    1785493
  • Title

    Hybrid approach for tuberculosis data classification using optimal centroid selection based clustering

  • Author

    Shukla, M. ; Agarwal, Sankalp

  • Author_Institution
    Dept. of Inf. Technol., Indian Inst. of Inf. Technol., Allahabad, India
  • fYear
    2014
  • fDate
    28-30 May 2014
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    Applying classification techniques in healthcare is challenging because medical data is high-dimensional and dynamic in nature. This research studies various approaches for transforming large data into smaller datasets effectively so that accurate classification can be performed. Data clustering is a machine learning approach that divides a dataset into smaller partitions with high similarity within each partition and dissimilarity between different partitions. Many clustering algorithms exist for datasets of varying nature, each with its own advantages and limitations depending on the nature of the individual dataset; thus there is sufficient scope to explore new and efficient algorithms for clustering-based classification. This paper presents a new approach to centroid selection in the k-mean algorithm for health datasets, which gives better clustering results than the traditional k-mean algorithm. The algorithm is evaluated on a tuberculosis dataset, and the clustering results are then applied to a classifier for performance evaluation; the results show an improvement over the previous algorithm.
  • Keywords
    diseases; learning (artificial intelligence); medical computing; pattern classification; pattern clustering; classification technique; clustering algorithm; clustering based classification; data clustering; data transformation; health care; k-mean algorithm; machine learning approach; medical data; optimal centroid selection based clustering; partition dissimilarity; partition similarity; tuberculosis data classification; Algorithm design and analysis; Classification algorithms; Clustering algorithms; Data mining; Diseases; Information technology; Partitioning algorithms; Centroid selection; Classification; Clustering; Tuberculosis dataset; k-mean;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Engineering and Systems (SCES), 2014 Students Conference on
  • Conference_Location
    Allahabad
  • Print_ISBN
    978-1-4799-4940-3
  • Type

    conf

  • DOI
    10.1109/SCES.2014.6880115
  • Filename
    6880115