DocumentCode
2973267
Title
PAID: Mining Sequential Patterns by Passed Item Deduction in Large Databases
Author
Yang, Zhenglu ; Kitsuregawa, Masaru ; Wang, Yitong
Author_Institution
Dept. of Inf. & Commun. Eng., Tokyo Univ.
fYear
2006
fDate
Dec. 2006
Firstpage
113
Lastpage
120
Abstract
Sequential pattern mining is important because it underlies many applications. Yet implementing the mining efficiently is difficult because of an inherent characteristic of the problem: the large size of the dataset. Although a great deal of effort has gone into sequential pattern mining in recent years, its performance is still far from satisfactory. In this paper, we propose a new algorithm called passed item deduced sequential pattern mining (abbreviated as PAID), which can efficiently find all the frequent sequential patterns in a large database. The main difference between our strategy and existing work is that other algorithms accumulate the candidate support from scratch in each iteration, whereas PAID reuses the temporary results (support values) of k-length frequent patterns when discovering (k+1)-length patterns, which can greatly reduce the search space in mining sequential patterns. Our experimental results and performance studies show that PAID outperforms previous work by meaningful margins on large datasets.
Keywords
data mining; very large databases; large databases; passed item deduction; search space; sequential pattern mining; Application software; Association rules; Biology; Computer science; Data mining; Deductive databases; Itemsets; Pattern analysis; Surges; Time factors;
fLanguage
English
Publisher
IEEE
Conference_Titel
Database Engineering and Applications Symposium, 2006. IDEAS '06. 10th International
Conference_Location
Delhi
ISSN
1098-8068
Print_ISBN
0-7695-2577-6
Type
conf
DOI
10.1109/IDEAS.2006.34
Filename
4041610