• DocumentCode
    3322159
  • Title
    DHC: a density-based hierarchical clustering method for time series gene expression data
  • Author
    Jiang, Daxin ; Pei, Jian ; Zhang, Aidong
  • Author_Institution
    Dept. of Comput. Sci., State Univ. of New York, Buffalo, NY, USA
  • fYear
    2003
  • fDate
    10-12 March 2003
  • Firstpage
    393
  • Lastpage
    400
  • Abstract
    Clustering time series gene expression data is an important task in bioinformatics research and biomedical applications. Recently, some clustering methods have been adapted or proposed. However, some concerns remain, such as the robustness of the mining methods and the quality and interpretability of the mining results. In this paper, we tackle the problem of effectively clustering time series gene expression data by proposing DHC, a density-based hierarchical clustering method. We use a density-based approach to identify the clusters so that the clustering results are robust and of high quality. Moreover, the mining result takes the form of a density tree, which uncovers the embedded clusters in a data set. The inner structures, borders, and outliers of the clusters can be further investigated using the attraction tree, which is an intermediate result of the mining. With these two trees, the internal structure of the data set can be visualized effectively. Our empirical evaluation using some real-world data sets shows that the method is effective, robust, and scalable. It matches the ground truth provided by bioinformatics experts very well in the sample data sets.
  • Keywords
    DNA; biology computing; data mining; genetics; time series; trees (mathematics); attraction tree; bioinformatics research; biomedical applications; density tree; density-based hierarchical clustering method; effective robust scalable method; mining methods robustness; time series gene expression data; Algorithm design and analysis; Bioinformatics; Clustering algorithms; Clustering methods; DNA; Data mining; Gene expression; Multimedia databases; Robustness; Transaction databases;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics and Bioengineering, 2003. Proceedings. Third IEEE Symposium on
  • Print_ISBN
    0-7695-1907-5
  • Type
    conf
  • DOI
    10.1109/BIBE.2003.1188978
  • Filename
    1188978