• DocumentCode
    2753972
  • Title
    A Robust Method for Biological Sequence Clustering
  • Author
    Chen, Wei-Bang ; Zhang, Chengcui
  • Author_Institution
    Dept. of Comput. & Inf. Sci., Univ. of Alabama at Birmingham, AL
  • fYear
    2006
  • fDate
    16-18 Sept. 2006
  • Firstpage
    286
  • Lastpage
    291
  • Abstract
    In this paper, we propose a two-phase hybrid method for biological sequence clustering that combines the strengths of hierarchical agglomerative clustering and partition clustering. In phase I, the method uses a hierarchical agglomerative clustering algorithm to pre-cluster the aligned sequences; in phase II, it takes the pre-clustering result as the initial partition for a k-means partition clustering method based on profile hidden Markov models (HMMs). Compared with random initial partitions, the initial partitions generated in phase I are usually more reasonable, and thus avoid the inconsistent results that partition clustering methods produce when their initial partitions are chosen at random. In addition, the inaccuracy of hierarchical agglomerative clustering can be compensated for by the profile HMM based k-means partition clustering, since the latter is model-based and can better describe the dynamic properties of the data in a cluster. Experiments on a molecular sequence dataset demonstrate the effectiveness and efficiency of the proposed hybrid clustering algorithm.
  • Keywords
    biology computing; hidden Markov models; molecular biophysics; pattern clustering; biological sequence clustering; hidden Markov model; hierarchical agglomerative clustering; hybrid clustering algorithm; k-means partition clustering; molecular sequence dataset; Biological system modeling; Biology computing; Buildings; Clustering algorithms; Clustering methods; Hidden Markov models; Iterative algorithms; Iterative methods; Partitioning algorithms; Robustness;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Reuse and Integration, 2006 IEEE International Conference on
  • Conference_Location
    Waikoloa Village, HI
  • Print_ISBN
    0-7803-9788-6
  • Type
    conf
  • DOI
    10.1109/IRI.2006.252427
  • Filename
    4018504
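
The two-phase procedure outlined in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and every name in it (`agglomerate`, `build_profile`, `refine`, `hybrid_cluster`, the `ALPHABET` constant, the pseudocount value) is hypothetical. Two simplifying assumptions are made: phase I uses average linkage over Hamming distance, which the abstract does not specify, and phase II replaces the profile HMM with a plain position-specific residue frequency profile (no insert or delete states), so it only mimics the model-based reassignment step.

```python
import math
from itertools import combinations

ALPHABET = "ACGT-"  # assumed alphabet for aligned nucleotide sequences


def hamming(a, b):
    """Number of mismatching columns between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def agglomerate(seqs, k):
    """Phase I (assumed average linkage): merge clusters until k remain."""
    clusters = [[i] for i in range(len(seqs))]

    def avg_dist(pair):
        p, q = pair
        total = sum(hamming(seqs[i], seqs[j])
                    for i in clusters[p] for j in clusters[q])
        return total / (len(clusters[p]) * len(clusters[q]))

    while len(clusters) > k:
        # combinations yields p < q, so deleting q leaves index p valid
        p, q = min(combinations(range(len(clusters)), 2), key=avg_dist)
        clusters[p] += clusters[q]
        del clusters[q]
    return clusters


def build_profile(members, seqs, pseudo=1.0):
    """Per-column log-frequencies with pseudocounts.

    A simplified stand-in for a profile HMM, not a real one.
    """
    profile = []
    for col in range(len(seqs[0])):
        counts = {ch: pseudo for ch in ALPHABET}
        for i in members:
            counts[seqs[i][col]] += 1
        total = sum(counts.values())
        profile.append({ch: math.log(c / total) for ch, c in counts.items()})
    return profile


def log_likelihood(seq, profile):
    """Log-probability of a sequence under a column-independent profile."""
    return sum(col[ch] for ch, col in zip(seq, profile))


def refine(seqs, clusters, max_iter=20):
    """Phase II: k-means style loop seeded with the phase I partition."""
    k = len(clusters)
    labels = [0] * len(seqs)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    for _ in range(max_iter):
        # Refit one model per cluster, then reassign each sequence
        # to the model under which it is most likely.
        profiles = [
            build_profile([i for i, lab in enumerate(labels) if lab == c], seqs)
            for c in range(k)
        ]
        new_labels = [
            max(range(k), key=lambda c: log_likelihood(s, profiles[c]))
            for s in seqs
        ]
        if new_labels == labels:
            break
        labels = new_labels
    return labels


def hybrid_cluster(seqs, k):
    """Run phase I, then use its result as the initial partition for phase II."""
    return refine(seqs, agglomerate(seqs, k))


if __name__ == "__main__":
    toy = ["ACGTACGT", "ACGTACGA", "ACGTACGT",
           "TTTTGGGG", "TTTTGGGC", "TTTAGGGG"]
    print(hybrid_cluster(toy, 2))
```

Because phase II starts from the deterministic phase I partition rather than a random one, repeated runs on the same input return the same labels, which is the consistency property the abstract emphasizes.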