• DocumentCode
    472214
  • Title
    Protein Classification using Sequential Pattern Mining
  • Author
    Exarchos, Themis P.; Papaloukas, Costas; Lampros, Christos; Fotiadis, Dimitrios I.
  • Author_Institution
    Dept. of Comput. Sci., Ioannina Univ.
  • fYear
    2006
  • fDate
    Aug. 30 2006-Sept. 3 2006
  • Firstpage
    5814
  • Lastpage
    5817
  • Abstract
    Protein classification in terms of fold recognition can be employed to determine the structural and functional properties of a newly discovered protein. In this work, sequential pattern mining (SPM) is utilized for sequence-based fold recognition. One of the most efficient SPM algorithms, cSPADE, is employed for protein primary structure analysis. A classifier then uses the extracted sequential patterns to assign proteins of unknown structure to the appropriate fold category. The proposed methodology exhibited an overall accuracy of 36% in a multi-class problem of 17 candidate categories. The classification performance reaches up to 65% when the three most probable protein folds are considered.
  • Keywords
    biology computing; data mining; molecular biophysics; pattern classification; pattern recognition; proteins; cSPADE; protein classification; protein primary structure analysis; sequence-based fold recognition; sequential pattern mining; Algorithm design and analysis; Cities and towns; Data mining; Genomics; Itemsets; Pattern recognition; Proteins; Scanning probe microscopy; Sequences; USA Councils
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Engineering in Medicine and Biology Society, 2006. EMBS '06. 28th Annual International Conference of the IEEE
  • Conference_Location
    New York, NY
  • ISSN
    1557-170X
  • Print_ISBN
    1-4244-0032-5
  • Electronic_ISBN
    1557-170X
  • Type
    conf
  • DOI
    10.1109/IEMBS.2006.260336
  • Filename
    4463129