• DocumentCode
    170588
  • Title
    Similarity analysis of DNA sequences based on k-word
  • Author
    Yingxin Hu ; Zhaohui Qi ; Lijuan Zheng ; Wenfeng Zhou
  • Author_Institution
    Coll. of Inf. Sci. & Technol., Shijiazhuang Tiedao Univ., Shijiazhuang, China
  • fYear
    2014
  • fDate
    16-18 May 2014
  • Firstpage
    621
  • Lastpage
    625
  • Abstract
    Based on the position information and numbers of k-words, a method is proposed to compare genetic sequences and infer evolutionary relationships. In this study, a characteristic vector whose elements are the average distances from the beginning of the k-word is introduced to represent DNA sequences. The approach gives a one-to-one correspondence between DNA sequences and vectors. Finally, 48 HEV (Hepatitis E virus) sequences and sequences from some mammalian species are used as test datasets to reconstruct phylogenetic trees based on the Euclidean distance measure. Compared with other methods, the results show that this method is efficient and suitable for similarity analysis.
  • Keywords
    DNA; biology computing; genetics; microorganisms; DNA sequences; Euclidean distance measure; HEV; Hepatitis E virus; characteristic vector; evolutionary relationship; genetic sequences; k-words; mammalian species; phylogenetic trees; position information; similarity analysis; Bioinformatics; Phylogeny; Strain; Vectors; Phylogenetic Analysis; k-word
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Progress in Informatics and Computing (PIC), 2014 International Conference on
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-1-4799-2033-4
  • Type
    conf
  • DOI
    10.1109/PIC.2014.6972409
  • Filename
    6972409
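
The abstract describes representing a DNA sequence by a vector of average k-word positions and comparing sequences by Euclidean distance. The record does not give the paper's exact definitions, so the following is only a minimal sketch under stated assumptions: "distance from the beginning" is taken to be the 1-based start position of each k-word occurrence, absent k-words are assigned 0.0, and k-word counts are not included in the vector. The names `kword_vector` and `euclidean` are hypothetical and do not come from the paper, and this simplified vector is not claimed to reproduce the one-to-one correspondence mentioned in the abstract.

```python
from itertools import product
from math import sqrt

ALPHABET = "ACGT"


def kword_vector(seq, k=2):
    """Mean 1-based start position of every k-word over ACGT.

    Assumption: a k-word that never occurs contributes 0.0.
    The vector has 4**k entries in lexicographic k-word order.
    """
    seq = seq.upper()
    words = ["".join(p) for p in product(ALPHABET, repeat=k)]
    positions = {w: [] for w in words}
    for i in range(len(seq) - k + 1):
        w = seq[i:i + k]
        if w in positions:  # skips words with ambiguous bases such as N
            positions[w].append(i + 1)
    return [
        sum(positions[w]) / len(positions[w]) if positions[w] else 0.0
        for w in words
    ]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


if __name__ == "__main__":
    a = kword_vector("ACGTACGT", k=2)
    b = kword_vector("ACGTTGCA", k=2)
    print(euclidean(a, b))
```

A pairwise distance matrix built this way could then be passed to a standard tree-building method; which method the authors used is not stated in this record.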