DocumentCode
2403862
Title
FAST: a new sampling-based algorithm for discovering association rules
Author
Bin Chen ; Haas, Peter J. ; Scheuermann, Peter
fYear
2002
fDate
2002
Firstpage
263
Abstract
We present FAST (finding associations from sampled transactions), a refined sampling-based mining algorithm that is distinguished from prior algorithms by its novel two-phase approach to sample collection. In phase I, a large sample is collected to quickly and accurately estimate the support of each item in the database. In phase II, a small final sample is obtained by excluding "outlier" transactions in such a manner that the support of each item in the final sample is as close as possible to the estimated support of the item in the entire database. We propose two approaches to obtaining the final sample in phase II: trimming and growing. The trimming procedure starts from the large initial sample and removes outlier transactions until a specified stopping criterion is satisfied. In contrast, the growing procedure selects representative transactions from the initial sample and adds them to an initially empty data set.
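The two phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the distance measure, the stopping criterion, or the selection strategy, so this sketch assumes an L1 distance between item-support vectors, a fixed target sample size as the stopping criterion, and a simple greedy choice at each step. All function names (`phase_one`, `trim`, `grow`, and so on) are hypothetical.

```python
import random
from collections import Counter


def item_supports(transactions, items):
    """Return the fraction of transactions that contain each item."""
    n = len(transactions)
    counts = Counter(item for t in transactions for item in set(t))
    return {i: (counts[i] / n if n else 0.0) for i in items}


def distance(sample, target, items):
    """L1 distance between sample supports and target supports (assumed metric)."""
    sup = item_supports(sample, items)
    return sum(abs(sup[i] - target[i]) for i in items)


def phase_one(database, initial_size, seed=0):
    """Phase I: draw a large random sample and estimate per-item supports from it."""
    rng = random.Random(seed)
    initial = rng.sample(database, initial_size)
    items = sorted({i for t in initial for i in t})
    return initial, items, item_supports(initial, items)


def trim(initial, target, items, final_size):
    """Phase II (trimming): greedily drop the transaction whose removal
    leaves the sample closest to the estimated supports.
    Assumed stopping criterion: the sample reaches final_size."""
    sample = list(initial)
    while len(sample) > final_size:
        best = min(
            range(len(sample)),
            key=lambda k: distance(sample[:k] + sample[k + 1:], target, items),
        )
        sample.pop(best)
    return sample


def grow(initial, target, items, final_size):
    """Phase II (growing): start from an empty set and greedily add the
    transaction that keeps the sample closest to the estimated supports."""
    remaining = list(initial)
    sample = []
    while len(sample) < final_size and remaining:
        best = min(
            range(len(remaining)),
            key=lambda k: distance(sample + [remaining[k]], target, items),
        )
        sample.append(remaining.pop(best))
    return sample


if __name__ == "__main__":
    # Toy database; each transaction is a tuple of item identifiers.
    db = [("a", "b"), ("a",), ("b",), ("a", "b"), ("c",), ("a",)]
    initial, items, target = phase_one(db, initial_size=6)
    print(trim(initial, target, items, final_size=3))
    print(grow(initial, target, items, final_size=3))
```

The greedy loops recompute supports from scratch at every step, which keeps the sketch short but is quadratic in the sample size; a practical version would update item counts incrementally.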
Keywords
data mining; database management systems; transaction processing; FAST; association rule discovery; finding associations from sampled transactions; growing; sample collection; sampling-based mining algorithm; trimming; Association rules; Frequency conversion; Frequency measurement; Itemsets; Measurement standards; Phase estimation; Sampling methods; Size measurement; Transaction databases; Writing
fLanguage
English
Publisher
IEEE
Conference_Title
Data Engineering, 2002. Proceedings. 18th International Conference on
Conference_Location
San Jose, CA
ISSN
1063-6382
Print_ISBN
0-7695-1531-2
Type
conf
DOI
10.1109/ICDE.2002.994717
Filename
994717