• DocumentCode
    2445111
  • Title
    Biosequence Classification using Sequential Pattern Mining and Optimization
  • Author
    Fotiadis, D.I.; Exarchos, T.P.; Tsipouras, M.G.; Papaloukas, C.
  • Author_Institution
    Univ. of Ioannina, Ioannina
  • fYear
    2007
  • fDate
    8-11 Nov. 2007
  • Firstpage
    58
  • Lastpage
    61
  • Abstract
    This paper presents a methodology for biosequence classification that combines sequential pattern mining with optimization algorithms. In the first stage, a sequential pattern mining algorithm is applied to a set of biological sequences to extract sequential patterns. A scoring function then computes the score of each pattern with respect to each sequence, and from these scores the score of each class under consideration is estimated. The pattern and class scores are then updated by multiplying them by weights. In the second stage, an optimization technique computes the weight values that maximize classification accuracy. The methodology is applied to the protein class and fold prediction problems and is evaluated extensively on a dataset obtained from the Protein Data Bank.
  • Keywords
    biology computing; data mining; optimisation; pattern classification; biosequence classification; optimization algorithm; protein data bank; scoring function; sequential pattern mining; Application software; Biology; Computer science; DNA; Information systems; Intelligent systems; Optimization methods; Proteins; Sequences; Text categorization; optimization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Technology Applications in Biomedicine, 2007. ITAB 2007. 6th International Special Topic Conference on
  • Conference_Location
    Tokyo
  • Print_ISBN
    978-1-4244-1868-8
  • Electronic_ISBN
    978-1-4244-1868-8
  • Type
    conf
  • DOI
    10.1109/ITAB.2007.4407423
  • Filename
    4407423
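
The two-stage pipeline described in the Abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not name the mining algorithm, the scoring function, or the optimizer, so each of them is replaced here by a labeled assumption. Patterns are taken to be contiguous k-mers that occur in a minimum fraction of a class's sequences, the pattern score is a 0/1 occurrence indicator, the class score is a weighted sum of pattern scores, and the weights are tuned by a simple random hill climb on training accuracy. All function names, parameters, and the toy sequences are hypothetical.

```python
import random
from collections import defaultdict


def mine_patterns(sequences, labels, k=2, min_support=0.6):
    """Stage 1 (assumed form): per class, keep the k-mers found in at
    least min_support of that class's training sequences."""
    by_class = defaultdict(list)
    for seq, lab in zip(sequences, labels):
        by_class[lab].append(seq)
    patterns = {}
    for lab, seqs in by_class.items():
        counts = defaultdict(int)
        for seq in seqs:
            # A set, so each pattern is counted once per sequence.
            for kmer in {seq[i:i + k] for i in range(len(seq) - k + 1)}:
                counts[kmer] += 1
        patterns[lab] = sorted(
            p for p, c in counts.items() if c / len(seqs) >= min_support
        )
    return patterns


def pattern_score(pattern, seq):
    """Assumed scoring function: 1.0 if the pattern occurs, else 0.0."""
    return 1.0 if pattern in seq else 0.0


def class_scores(seq, patterns, weights):
    """Class score = sum of weighted scores of that class's patterns."""
    return {
        lab: sum(weights[(lab, p)] * pattern_score(p, seq) for p in pats)
        for lab, pats in patterns.items()
    }


def predict(seq, patterns, weights):
    scores = class_scores(seq, patterns, weights)
    # Ties are broken by label order so the result is deterministic.
    return max(sorted(scores), key=scores.get)


def accuracy(sequences, labels, patterns, weights):
    hits = sum(predict(s, patterns, weights) == y
               for s, y in zip(sequences, labels))
    return hits / len(sequences)


def optimize_weights(sequences, labels, patterns, iters=200, seed=0):
    """Stage 2 (assumed optimizer): perturb one weight at a time and keep
    the change only if training accuracy does not decrease."""
    rng = random.Random(seed)
    weights = {(lab, p): 1.0 for lab, pats in patterns.items() for p in pats}
    best = accuracy(sequences, labels, patterns, weights)
    keys = sorted(weights)
    if not keys:
        return weights, best
    for _ in range(iters):
        key = rng.choice(keys)
        old = weights[key]
        weights[key] = max(0.0, old + rng.uniform(-0.5, 0.5))
        acc = accuracy(sequences, labels, patterns, weights)
        if acc >= best:
            best = acc
        else:
            weights[key] = old  # revert a harmful change
    return weights, best


# Toy data (hypothetical; not drawn from the Protein Data Bank).
train_seqs = ["AABBA", "ABBAB", "BBAAB", "CCDCD", "CDDCC", "DCCDC"]
train_labels = ["x", "x", "x", "y", "y", "y"]

patterns = mine_patterns(train_seqs, train_labels)
unit_weights = {(lab, p): 1.0 for lab, pats in patterns.items() for p in pats}
tuned_weights, train_acc = optimize_weights(train_seqs, train_labels, patterns)
```

A hill climb is used only because it needs nothing beyond the standard library; any optimizer that searches the weight space against classification accuracy would fit the same slot. The acceptance rule never lowers training accuracy, so the returned weights are at least as good on the training set as the initial unit weights.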