Title :
Integrating Frequent Itemsets Mining with Relational Database
Author_Institution :
Shandong Inst. of Bus. & Technol., YanTai
Abstract :
Frequent itemset mining is becoming increasingly important as databases continue to grow in size. Database systems are currently dominated by relational databases. However, SQL-based data mining is known to perform worse than specialized implementations and the expensive mining tools on the market. In this paper we analyze a well-known frequent itemset discovery algorithm, FP-growth, and propose a new implementation approach called DBFP-Growth, which creates a disk-based FP-tree using ORACLE PL/SQL and executes faster than using SQL directly. We also present a novel SQL-based method, DRRW, which removes duplicate records from a database without generating a temporary table.
Keywords :
SQL; data mining; relational databases; DBFP-growth; ORACLE PL/SQL; SQL based data mining; database systems; duplicate records; frequent itemsets mining; mining tools; relational database; Algorithm design and analysis; Consumer electronics; Data mining; Database systems; Instruments; Itemsets; Marketing and sales; Relational databases; Size measurement; Transaction databases; Data mining; Database; Frequent itemset;
Conference_Title :
Electronic Measurement and Instruments, 2007. ICEMI '07. 8th International Conference on
Conference_Location :
Xi'an
Print_ISBN :
978-1-4244-1136-8
Electronic_ISBN :
978-1-4244-1136-8
DOI :
10.1109/ICEMI.2007.4350737