DocumentCode :
390914
Title :
Association analysis with one scan of databases
Author :
Huang, Hao ; Wu, Xindong ; Relue, Richard
Author_Institution :
Dept. of Math & Comput. Sci., Colorado Sch. of Mines, Golden, CO, USA
fYear :
2002
fDate :
2002
Firstpage :
629
Lastpage :
632
Abstract :
Mining frequent patterns with an FP-tree avoids costly candidate generation and repeated occurrence-frequency checking against the support threshold. It therefore achieves better performance and efficiency than Apriori-like algorithms. However, the database still needs to be scanned twice to build the FP-tree. This can be very time-consuming when new data are added to an existing database, because two scans may be needed not only for the new data but also for the existing data. This paper presents a new data structure, the P-tree (Pattern Tree), and a new technique that builds the P-tree in only one scan of the database and can derive the corresponding FP-tree for a specified support threshold. Updating a P-tree with new data requires only one scan of the new data; the existing data do not need to be rescanned.
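The abstract does not spell out how the P-tree is built, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' algorithm. It assumes the one-scan tree stores each transaction in a fixed lexicographic item order with no support filtering, and that an FP-tree for a chosen threshold is derived afterwards by re-inserting the stored paths, filtered and reordered by descending item frequency. The names `PatternTree`, `Node`, `insert`, `paths`, and `to_fp_tree` are hypothetical.

```python
from collections import Counter


class Node:
    """A prefix-tree node: an item label, a count, and child nodes."""
    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


def insert(root, items, count=1):
    """Insert an ordered item list into the tree, adding `count` along the path."""
    node = root
    for item in items:
        node = node.children.setdefault(item, Node(item))
        node.count += count


class PatternTree:
    """Hypothetical one-scan tree: keeps every item, applies no threshold."""
    def __init__(self):
        self.root = Node()
        self.item_counts = Counter()

    def add(self, transaction):
        # Assumption: a fixed lexicographic order, so no prior frequency scan is needed.
        items = sorted(set(transaction))
        self.item_counts.update(items)
        insert(self.root, items)

    def paths(self):
        """Yield (items, n) where n transactions end exactly at that node."""
        stack = [(self.root, [])]
        while stack:
            node, prefix = stack.pop()
            ended = node.count - sum(c.count for c in node.children.values())
            if prefix and ended > 0:
                yield prefix, ended
            for child in node.children.values():
                stack.append((child, prefix + [child.item]))

    def to_fp_tree(self, min_support):
        """Derive an FP-tree-shaped prefix tree without rescanning the database."""
        frequent = {i for i, c in self.item_counts.items() if c >= min_support}

        def order(item):
            # Descending frequency, ties broken by item name.
            return (-self.item_counts[item], item)

        fp_root = Node()
        for items, count in self.paths():
            kept = sorted((i for i in items if i in frequent), key=order)
            insert(fp_root, kept, count)
        return fp_root


tree = PatternTree()
for t in [["a", "b", "c"], ["a", "b"], ["a", "d"], ["b", "c"]]:
    tree.add(t)                      # single pass over the existing data
fp = tree.to_fp_tree(min_support=2)  # item "d" is infrequent and dropped

tree.add(["a", "d"])                 # new data: only the new transaction is read
fp_updated = tree.to_fp_tree(min_support=2)  # "d" now meets the threshold
```

In this sketch the support threshold is supplied only when the FP-tree is derived, which is what lets the same stored tree serve different thresholds and absorb new transactions without revisiting old ones. The header table and node links of a full FP-tree are omitted.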
Keywords :
data mining; pattern recognition; tree data structures; very large databases; Apriori-like algorithms; FP-tree; P-tree data structure; Pattern Tree; association analysis; association rule; candidate generation; data mining; database scan; frequent pattern mining; large database; occurrence frequency checking; performance; support threshold; Association rules; Computer science; Data structures; Frequency; Itemsets; Iterative algorithms; Transaction databases; Tree data structures;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining, 2002. ICDM 2002. Proceedings. 2002 IEEE International Conference on
Print_ISBN :
0-7695-1754-4
Type :
conf
DOI :
10.1109/ICDM.2002.1184015
Filename :
1184015