Title :
PAID: Mining Sequential Patterns by Passed Item Deduction in Large Databases
Author :
Yang, Zhenglu ; Kitsuregawa, Masaru ; Wang, Yitong
Author_Institution :
Dept. of Inf. & Commun. Eng., Tokyo Univ.
Abstract :
Sequential pattern mining is important because it underlies many applications. Mining efficiently is difficult, however, because of an inherent characteristic of the problem: the large size of the dataset. Although a great deal of effort has gone into sequential pattern mining in recent years, its performance is still far from satisfactory. In this paper, we propose a new algorithm called passed item deduced sequential pattern mining (abbreviated as PAID), which can efficiently find all the frequent sequential patterns in a large database. The main difference between our strategy and existing work is that other algorithms accumulate the candidate support from scratch in each iteration, whereas PAID reuses the temporary results (support values) of k-length frequent patterns when discovering (k+1)-length patterns, which greatly reduces the search space. Our experimental results and performance studies show that PAID outperforms previous work by meaningful margins on large datasets.
Keywords :
data mining; very large databases; large databases; passed item deduction; search space; sequential pattern mining; Application software; Association rules; Biology; Computer science; Data mining; Deductive databases; Itemsets; Pattern analysis; Surges; Time factors
Conference_Title :
Database Engineering and Applications Symposium, 2006. IDEAS '06. 10th International
Conference_Location :
Delhi
Print_ISBN :
0-7695-2577-6
DOI :
10.1109/IDEAS.2006.34