Regional Information Center for Science and Technology - Scalable APRIORI-Based Frequent Pattern Discovery

DocumentCode :

1733864

Title :

Scalable APRIORI-Based Frequent Pattern Discovery

Author :

Chester, Sean ; Sandler, Ian ; Thomo, Alex

Author_Institution :

Univ. of Victoria, Victoria, BC, Canada

Year :

2009

Abstract :

Frequent pattern discovery, the task of finding sets of items that frequently occur together in a dataset, has been at the core of the field of data mining for the past sixteen years. In that time, the size of datasets has grown much faster than has the ability of existing algorithms to handle those datasets. Consequently, improvements are needed. In this paper, we take the classic algorithm for the problem, Apriori, and by adding a vertical sort drastically improve its performance characteristics when processing very large datasets. We use the large benchmark dataset webdocs from the FIMI 2004 conference to contrast our performance against several state-of-the-art implementations and demonstrate both equal efficiency with lower memory usage at all support thresholds and also the ability to mine support thresholds as yet unattempted in the literature. We also indicate how this work can be extended to achieve yet more impressive results.
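
For readers unfamiliar with the task, the following is a minimal sketch of the textbook level-wise Apriori procedure that the abstract takes as its starting point. It is an illustration only: the function name `apriori`, its parameters, and the toy data are assumptions made here, and the sketch does not reproduce the vertical sort or any other optimization from the paper, whose details the abstract does not give.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Textbook Apriori (hypothetical helper, not the paper's implementation).

    Returns a dict mapping each frequent itemset (a frozenset) to the number
    of transactions that contain it. min_support is an absolute count.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in frequent
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count step: one pass over the data per level.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1

    return result
```

Calling `apriori(baskets, 3)` on a list of item lists returns every itemset contained in at least three of them. The count step rescans all transactions at each level, which is the kind of cost that grows quickly on very large datasets such as webdocs.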

Keywords :

data mining; frequent pattern discovery; scalable Apriori; Apriori; Data engineering; Design engineering; Frequency; Itemsets; Sorting; Technological innovation

Language :

English

Publisher :

IEEE

Conference_Title :

Computational Science and Engineering, 2009. CSE '09. International Conference on

Conference_Location :

Vancouver, BC

Print_ISBN :

978-1-4244-5334-4

Electronic_ISBN :

978-0-7695-3823-5

Type :

conf

DOI :

10.1109/CSE.2009.51

Filename :

5283015

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1733864