DocumentCode
3275439
Title
Discovering maximal subsequence patterns in sequence database
Author
Singhal, Leena ; Jain, Neha ; Gupta, Geeta ; Gupta, Neelima
Author_Institution
Dept. of Comput. Sci., Univ. of Delhi, Delhi, India
fYear
2009
fDate
14-15 Dec. 2009
Firstpage
1
Lastpage
5
Abstract
Mining sequential patterns in biological data has attracted a great deal of attention in the last couple of years. Biologists are interested in finding frequent ordered arrangements of motifs that may be responsible for similar expression of a group of genes. The size of the output space can be greatly reduced if only the maximal frequent patterns are reported. In this paper we present the maximal PrefixSpan algorithm, which reports the maximal frequent patterns in a sequence database. Experimental results on synthetic data show that the size of the output space is greatly reduced when only maximal frequent patterns are reported.
Keywords
biology computing; data mining; biological data; maximal PrefixSpan algorithm; maximal frequent pattern; maximal subsequence pattern discovery; sequence database; sequential pattern mining; Computer science; Costs; Data mining; Databases; Proteins; Sampling methods; Testing; Maximal frequent sequences; Sequence mining; TFBS;
fLanguage
English
Publisher
IEEE
Conference_Titel
Methods and Models in Computer Science, 2009. ICM2CS 2009. Proceeding of International Conference on
Conference_Location
Delhi
Print_ISBN
978-1-4244-5051-0
Type
conf
DOI
10.1109/ICM2CS.2009.5397958
Filename
5397958