DocumentCode :
3078584
Title :
Association rule based frequent pattern mining in biological sequences
Author :
Salim, Azzeddine ; Chandra, S. S. Vinod
Author_Institution :
Dept. of Comput.-Sci. & Eng., Coll. of Eng., Thiruvananthapuram, India
fYear :
2013
fDate :
26-28 Dec. 2013
Firstpage :
1
Lastpage :
5
Abstract :
Finding all frequent patterns in a set of strings is computationally intensive. An exhaustive search, in which every possible candidate is considered, is impractical for larger pattern widths because of its exponential computational complexity. Other approaches apply heuristics that reduce the search space but may compromise the accuracy of the results to some extent. We use a modified Apriori algorithm to mine patterns in a very long sequence, in particular the most frequent substring pattern of a fixed length in a biological sequence. The algorithm performs well because it rapidly reduces the search space and uses bit-wise operations instead of expensive string comparisons. It outperforms existing pattern-finding methods such as MEME in terms of execution time.
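The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea it describes: level-wise (Apriori-style) growth of fixed-length substrings, with candidates pruned when a shorter sub-pattern is infrequent, and with patterns held as integers so that comparisons become bit-wise operations. The 2-bit nucleotide encoding, the function names (`count_kmers`, `most_frequent_pattern`) and the `min_support` parameter are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter

# Assumed 2-bit code per nucleotide (illustrative, not the paper's encoding).
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def count_kmers(seq, k, allowed=None):
    """Count length-k substrings of seq, keyed by a rolling 2-bit integer.

    If `allowed` (the set of frequent (k-1)-mer keys) is given, a window is
    counted only when both its (k-1)-prefix and its (k-1)-suffix are in that
    set. This is the Apriori pruning step.
    """
    mask = (1 << (2 * k)) - 1            # keeps the last k bases
    sub_mask = (1 << (2 * (k - 1))) - 1  # keeps the last k-1 bases
    counts = Counter()
    key = 0
    valid = 0  # consecutive recognised bases ending at the current position
    for ch in seq:
        code = CODE.get(ch)
        if code is None:  # unknown symbol: restart the window
            key, valid = 0, 0
            continue
        key = ((key << 2) | code) & mask  # shift in the new base
        valid += 1
        if valid < k:
            continue
        if allowed is not None:
            prefix = key >> 2         # first k-1 bases
            suffix = key & sub_mask   # last k-1 bases
            if prefix not in allowed or suffix not in allowed:
                continue
        counts[key] += 1
    return counts


def decode(key, k):
    """Turn an integer key back into its length-k string."""
    return "".join(BASES[(key >> (2 * (k - 1 - i))) & 3] for i in range(k))


def most_frequent_pattern(seq, k, min_support=2):
    """Return (pattern, count) for the most frequent length-k substring that
    occurs at least min_support times, or None if there is none."""
    frequent = None
    counts = Counter()
    for length in range(1, k + 1):
        counts = count_kmers(seq, length, frequent)
        frequent = {key for key, c in counts.items() if c >= min_support}
        if not frequent:
            return None
    best = max(frequent, key=lambda key: counts[key])
    return decode(best, k), counts[best]


result = most_frequent_pattern("ACGTACGTACGA", 3)
```

In this sketch a higher `min_support` prunes more candidates at each level but can also exclude every pattern, in which case the function returns `None`.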
Keywords :
biology computing; data mining; genomics; string matching; association rule based frequent pattern mining; biological sequences; bit-wise operations; execution time; modified apriori algorithm; most-frequent substring pattern; search space reduction; Algorithm design and analysis; Bioinformatics; Databases; Generators; Genomics; Pattern matching; Apriori; Genomic Sequences; Most Frequent Pattern;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computational Intelligence and Computing Research (ICCIC), 2013 IEEE International Conference on
Conference_Location :
Enathi
Print_ISBN :
978-1-4799-1594-1
Type :
conf
DOI :
10.1109/ICCIC.2013.6724203
Filename :
6724203