• DocumentCode
    3275439
  • Title

    Discovering maximal subsequence patterns in sequence database

  • Author

    Singhal, Leena ; Jain, Neha ; Gupta, Geeta ; Gupta, Neelima

  • Author_Institution
    Dept. of Comput. Sci., Univ. of Delhi, Delhi, India
  • fYear
    2009
  • fDate
    14-15 Dec. 2009
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    Mining sequential patterns in biological data has attracted a great deal of attention in the last couple of years. Biologists are interested in finding frequent ordered arrangements of motifs that may be responsible for the similar expression of a group of genes. The size of the output space can be greatly reduced if only the maximal frequent patterns are reported. In this paper we present the maximal PrefixSpan algorithm, which reports the maximal frequent patterns in a sequence database. Experimental results on synthetic data show that the size of the output space is greatly reduced when only maximal frequent patterns are reported.
  • Keywords
    biology computing; data mining; biological data; maximal PrefixSpan algorithm; maximal frequent pattern; maximal subsequence pattern discovery; sequence database; sequential pattern mining; Computer science; Costs; Data mining; Databases; Proteins; Sampling methods; Testing; Maximal frequent sequences; Sequence mining; TFBS;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Methods and Models in Computer Science, 2009. ICM2CS 2009. Proceeding of International Conference on
  • Conference_Location
    Delhi
  • Print_ISBN
    978-1-4244-5051-0
  • Type

    conf

  • DOI
    10.1109/ICM2CS.2009.5397958
  • Filename
    5397958
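
  • Illustration
    The abstract's claim that reporting only maximal frequent patterns shrinks the output can be illustrated with a minimal sketch. This is not the paper's maximal PrefixSpan algorithm, whose details are not given in this record. It is a plain PrefixSpan-style miner followed by a naive post-filter, assuming each sequence is a string of single-character events and that a frequent pattern is maximal when it is not a subsequence of any other frequent pattern. The names `prefixspan`, `is_subsequence`, `maximal_patterns`, and the toy database are hypothetical.

```python
from typing import Dict, List, Tuple

Pattern = Tuple[str, ...]


def prefixspan(db: List[str], min_support: int) -> Dict[Pattern, int]:
    """Return {pattern: support} for every frequent subsequence of db.

    Assumes single-item events (one character per position).
    """
    results: Dict[Pattern, int] = {}

    def mine(prefix: Pattern, projected: List[Tuple[int, int]]) -> None:
        # projected holds (sequence index, start offset) pairs: the suffixes
        # that remain after matching `prefix` as early as possible.
        counts: Dict[str, int] = {}
        for sid, start in projected:
            for item in set(db[sid][start:]):  # count once per sequence
                counts[item] = counts.get(item, 0) + 1
        for item, count in sorted(counts.items()):
            if count < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = count
            next_projected = []
            for sid, start in projected:
                pos = db[sid].find(item, start)
                if pos != -1:
                    next_projected.append((sid, pos + 1))
            mine(pattern, next_projected)

    mine((), [(sid, 0) for sid in range(len(db))])
    return results


def is_subsequence(small: Pattern, big: Pattern) -> bool:
    """True if `small` can be obtained from `big` by deleting items."""
    it = iter(big)
    return all(item in it for item in small)


def maximal_patterns(frequent: Dict[Pattern, int]) -> Dict[Pattern, int]:
    """Keep only patterns that are not contained in another frequent pattern.

    Naive quadratic post-filter, an assumption made for illustration only.
    """
    return {
        p: support
        for p, support in frequent.items()
        if not any(p != q and is_subsequence(p, q) for q in frequent)
    }


if __name__ == "__main__":
    toy_db = ["ACGT", "ACGT", "ACT"]  # hypothetical toy database
    frequent = prefixspan(toy_db, min_support=2)
    maximal = maximal_patterns(frequent)
    print(len(frequent), "frequent patterns;", len(maximal), "maximal")
    print(maximal)
```

    On this toy database every non-empty subsequence of ACGT is frequent at a minimum support of 2, yet only ACGT itself is maximal, which mirrors the reduction in output size the abstract describes.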