• DocumentCode
    3341927
  • Title

    Fast Dynamic Programming Based Sequence Alignment Algorithm

  • Author

    Rashid, N.A.A. ; Abdullah, Rosni ; Talib, Abdullah Zawawi Haji ; Ali, Zalila

  • Author_Institution
    Sch. of Comput. Sci., Sci. Univ. of Malaysia, Penang
  • fYear
    2006
  • fDate
    May 2006
  • Firstpage
    1
  • Lastpage
    7
  • Abstract
    Protein sequence alignment is a basic operation widely used in protein sequence analysis. The optimal algorithms for sequence alignment are based on the dynamic programming method. The Smith-Waterman algorithm is the most commonly used dynamic programming based sequence alignment algorithm. However, the algorithm requires quadratic time and space. Heuristic algorithms such as FASTA and BLAST were introduced to speed up sequence alignment. FASTA is based on word search, whereas BLAST is based on maximal segment pairs. In the word search algorithm, lists of words from the query and database sequences are compared to determine whether the two sequences have a region of sufficient similarity to merit further alignment using the Smith-Waterman algorithm. All of these algorithms use a substitution matrix based on the twenty-letter amino acid alphabet. However, research shows that reducing the number of amino acids to 10 does not affect the similarity measure. Our proposed algorithm uses a reduced amino acid alphabet to transform the protein sequences into sequences of integers and uses n-grams to reduce the length of the sequences. The Smith-Waterman algorithm is then used to obtain the similarity measure between two sequences. Results show that the proposed algorithm is as sensitive as the Smith-Waterman algorithm yet uses less space and performs better.
  • Keywords
    biology computing; dynamic programming; heuristic programming; molecular biophysics; proteins; query processing; BLAST; FASTA; Smith-Waterman algorithm; amino acids; dynamic programming; protein sequence alignment; Algorithm design and analysis; Amino acids; Biomedical signal processing; Databases; Dynamic programming; Heuristic algorithms; Natural language processing; Protein sequence; Signal processing algorithms; Space technology; N-Gram methods; Reduce Amino Acids alphabet; Sequence Alignment;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Distributed Frameworks for Multimedia Applications, 2006. The 2nd International Conference on
  • Conference_Location
    Pulau Pinang
  • Print_ISBN
    1-4244-0409-6
  • Type

    conf

  • DOI
    10.1109/DFMA.2006.296909
  • Filename
    4077734
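
The pipeline outlined in the abstract (reduced alphabet, then n-grams, then Smith-Waterman) might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the 10-group partition in `GROUPS`, the use of non-overlapping n-grams, the match/mismatch/gap scores, and all function names are hypothetical choices made here, since the abstract does not specify them. The scoring routine also keeps only two rows of the dynamic programming table, which is a separate space-saving choice and not something the abstract describes.

```python
# Illustrative 10-group partition of the 20 amino acids.
# ASSUMPTION: this grouping is for demonstration only; the record does not
# say which reduced alphabet the paper uses.
GROUPS = ["LVIM", "C", "A", "G", "ST", "P", "FYW", "EDNQ", "KR", "H"]
CODE = {aa: i for i, group in enumerate(GROUPS) for aa in group}


def reduce_sequence(protein):
    """Map each residue to the integer id of its reduced-alphabet group."""
    return [CODE[aa] for aa in protein.upper()]


def to_ngrams(codes, n=2):
    """Split the integer sequence into non-overlapping n-grams (tuples).

    ASSUMPTION: non-overlapping blocks, so the length shrinks by about n.
    A shorter final block is kept as its own token.
    """
    return [tuple(codes[i:i + n]) for i in range(0, len(codes), n)]


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score with a linear gap penalty.

    Only the previous row is stored, so memory is O(len(b)); this returns
    the score only, not the alignment itself.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def reduced_ngram_similarity(p, q, n=2):
    """Reduce both proteins, tokenize into n-grams, and score them."""
    return smith_waterman_score(
        to_ngrams(reduce_sequence(p), n),
        to_ngrams(reduce_sequence(q), n),
    )
```

With this sketch, two sequences whose residues differ only within the same group (for example `"LKST"` and `"IRTS"`) reduce to the same tokens and score as identical. Because the dynamic programming table is built over n-gram tokens rather than residues, each dimension shrinks by roughly a factor of n. Note that non-overlapping n-grams are sensitive to frame shifts, so a single-residue insertion can change every following token; whether the paper addresses this is not stated in the abstract.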