• DocumentCode
    678137
  • Title
    Revisiting Link-Based Cluster Ensembles for Microarray Data Classification
  • Author
    Iam-On, Natthakan ; Boongoen, Tossapon
  • Author_Institution
    Sch. of Inf. Technol., Mae Fah Luang Univ., Chiang Rai, Thailand
  • fYear
    2013
  • fDate
    13-16 Oct. 2013
  • Firstpage
    4543
  • Lastpage
    4548
  • Abstract
    Cancer has been identified as the leading cause of death. It is predicted that around 20-26 million people will be diagnosed with cancer by 2020. Given this alarming rate, there is an urgent need for a more effective methodology to understand, prevent and cure cancer. Microarray technology provides a useful basis for achieving this ultimate goal. In cancer research in particular, it has become almost routine to create gene expression profiles, which can discriminate patients into good and poor prognosis groups and identify possible tumor subtypes. Such a classification or predictive model offers a useful tool for individualized treatment of disease. However, the accuracy of existing classifiers has been constrained by the curse of dimensionality typically observed in microarray data. As an alternative to gene selection, one may transform the original data into another representation, where only key gene components are included. Unlike conventional transformation-based techniques found in the literature, this paper presents a novel method that makes use of cluster ensembles, specifically the summarizing information matrix, as the transformed data for the subsequent classification step. Among the different state-of-the-art methods, the link-based cluster ensemble approach (LCE) provides highly accurate clustering, and is therefore employed here. The performance of this transformation model is evaluated on published microarray datasets with C4.5, in comparison with benchmark techniques. The findings suggest that the new model can improve on the classification accuracy obtained with the original data and performs better than the other transformation methods investigated in the empirical study.
  • Keywords
    cancer; data handling; medical computing; pattern classification; pattern clustering; C4.5; LCE; cancer research; dimensionality curse; disease treatment; gene components; gene expression profiles; link-based cluster ensemble approach; link-based cluster ensembles; microarray data classification; predictive model; prognosis groups; published microarray datasets; transformation methods; tumor subtypes; Accuracy; Cancer; Clustering algorithms; Data models; Error analysis; Gene expression; Principal component analysis; classification; cluster ensembles; link-based similarity; microarray data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems, Man, and Cybernetics (SMC), 2013 IEEE International Conference on
  • Conference_Location
    Manchester
  • Type
    conf
  • DOI
    10.1109/SMC.2013.773
  • Filename
    6722528