• DocumentCode
    2509410
  • Title
    Gene Expression Analysis Using Clustering
  • Author
    Dhiraj, Kumar; Rath, Santanu Kumar; Pandey, Abhishek
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Nat. Inst. of Technol. Rourkela, Rourkela, India
  • fYear
    2009
  • fDate
    11-13 June 2009
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    Data mining has become an important topic in the effective analysis of gene expression data because of its wide application in the biomedical industry. In this paper, the k-means clustering algorithm is studied extensively for gene expression analysis. To demonstrate the effectiveness of the k-means algorithm on a wide variety of data sets, we have chosen two pattern recognition data sets and thirteen microarray data sets, with both overlapping and non-overlapping cluster boundaries, in which the number of features/genes ranges from 4 to 7129 and the number of samples ranges from 32 to 683. The number of clusters ranges from two to eleven. We use the clustering error rate (or, equivalently, clustering accuracy) as the evaluation metric to measure the performance of the k-means algorithm.
  • Keywords
    data mining; genetics; lab-on-a-chip; medical computing; pattern clustering; biomedical industry; gene expression analysis; k-means clustering algorithm; microarray data sets; nonoverlapping cluster boundaries; overlapping cluster boundaries; pattern recognition; Breast; Cancer; Clustering algorithms; Clustering methods; Fungi; Gene expression; Iris; Lungs; Partitioning algorithms; Pattern recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics and Biomedical Engineering, 2009. ICBBE 2009. 3rd International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4244-2901-1
  • Electronic_ISBN
    978-1-4244-2902-8
  • Type
    conf
  • DOI
    10.1109/ICBBE.2009.5162877
  • Filename
    5162877