• DocumentCode
    1594378
  • Title
    Clustering Methods Based on Closest String via Rank Distance
  • Author
    Dinu, L.P. ; Ionescu, R.-T.
  • Author_Institution
    Dept. of Comput. Sci., Univ. of Bucharest, Bucharest, Romania
  • fYear
    2012
  • Firstpage
    207
  • Lastpage
    213
  • Abstract
    This paper presents two clustering methods based on rank distance. Rank distance has applications in many fields, such as computational linguistics, biology and informatics. It can be computed quickly and benefits from some features of the edit (Levenshtein) distance. In [1], two clustering methods based on rank distance are described: a K-means algorithm that uses the median string to represent the centroid of a cluster, and a hierarchical clustering method that joins pairs of strings and replaces each pair with its median string. This paper presents two similar clustering algorithms that use the closest string instead of the median string. The new clustering algorithms are compared with those presented in [1] and with other similar clustering techniques. Experiments on mitochondrial DNA sequences extracted from several mammals are performed to compare the results of the clustering methods. The results demonstrate the clustering performance and the utility of the new algorithms.
  • Keywords
    DNA; bioinformatics; computational linguistics; pattern clustering; string matching; K-means algorithm; biology; closest string; cluster centroid representation; clustering methods; clustering performance; hierarchical clustering method; informatics; mammals; median string; mitochondrial DNA sequences; rank distance; Algorithm design and analysis; Clustering algorithms; Phylogeny; Standards; DNA applications; DNA sequencing; closest substring; clustering; hierarchical clustering; k-means
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), 2012 14th International Symposium on
  • Conference_Location
    Timisoara
  • Print_ISBN
    978-1-4673-5026-6
  • Type
    conf
  • DOI
    10.1109/SYNASC.2012.14
  • Filename
    6481031