• DocumentCode
    3414334
  • Title

    Biclustering Gene Expression Data Using MSR Difference Threshold

  • Author

    Das, Shyama ; Idicula, Sumam Mary

  • Author_Institution
    Dept. of Comput. Sci., Cochin Univ. of Sci. & Technol., Kochi, India
  • fYear
    2009
  • fDate
    18-20 Dec. 2009
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    Biclustering is the simultaneous clustering of both rows and columns of a data matrix. A measure called mean squared residue (MSR) is used to simultaneously evaluate the coherence of rows and columns within a submatrix. In this paper, a novel algorithm is developed for biclustering gene expression data using the newly introduced concept of an MSR difference threshold. In the first step, high-quality bicluster seeds are generated using the K-means clustering algorithm. Then more genes and conditions (nodes) are added to the bicluster. Before a node is added, the MSR of the bicluster, X, is calculated. After the node is added, the MSR is calculated again as Y. The added node is deleted if Y minus X is greater than the MSR difference threshold, or if Y is greater than the MSR threshold, which depends on the dataset. The MSR difference threshold differs between the gene list and the condition list, and it also depends on the dataset. Proper values should be identified through experimentation in order to obtain high-quality biclusters. The results obtained on a benchmark dataset clearly indicate that this algorithm is better than many of the existing biclustering algorithms.
  • Keywords
    biology computing; data mining; genetics; pattern clustering; K-means clustering algorithm; biclustering gene expression data; mean squared residue difference threshold; Biological systems; Clustering algorithms; Computer science; Data mining; Gene expression
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    India Conference (INDICON), 2009 Annual IEEE
  • Conference_Location
    Gujarat
  • Print_ISBN
    978-1-4244-4858-6
  • Electronic_ISBN
    978-1-4244-4859-3
  • Type

    conf

  • DOI
    10.1109/INDCON.2009.5409395
  • Filename
    5409395
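
The node-addition rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the standard mean squared residue definition (the mean of the squared residues a_ij - rowmean_i - colmean_j + overallmean over the submatrix), and it covers only the gene-addition step, not the K-means seed generation. The names `msr`, `try_add_gene`, `msr_threshold`, and `diff_threshold` are hypothetical and chosen for illustration.

```python
import numpy as np


def msr(sub):
    """Mean squared residue of a submatrix (standard definition, assumed here)."""
    sub = np.asarray(sub, dtype=float)
    row_means = sub.mean(axis=1, keepdims=True)
    col_means = sub.mean(axis=0, keepdims=True)
    residue = sub - row_means - col_means + sub.mean()
    return float(np.mean(residue ** 2))


def try_add_gene(data, rows, cols, gene, msr_threshold, diff_threshold):
    """Tentatively add one gene (row index) to a bicluster.

    Returns the resulting row list and its MSR. The gene is rejected if the
    MSR increase (Y - X) exceeds diff_threshold or if the new MSR (Y)
    exceeds msr_threshold. Both thresholds are dataset-dependent and must
    be tuned by experiment.
    """
    x = msr(data[np.ix_(rows, cols)])        # MSR before adding the node
    candidate = rows + [gene]
    y = msr(data[np.ix_(candidate, cols)])   # MSR after adding the node
    if y - x > diff_threshold or y > msr_threshold:
        return rows, x                       # delete the added node
    return candidate, y                      # keep the added node


# Toy data: rows 0-2 follow an additive pattern, row 3 does not.
data = np.array([
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [3.0, 4.0, 5.0],
    [9.0, 0.0, 9.0],
])
rows, score = try_add_gene(data, [0, 1], [0, 1, 2], 2, 0.5, 0.1)
rows_after, score_after = try_add_gene(data, rows, [0, 1, 2], 3, 0.5, 0.1)
```

In this toy example the coherent row 2 is kept and the incoherent row 3 is rejected. Adding a condition would follow the same pattern on the column list, with its own difference threshold.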