Title :
Efficient association rule mining using indexing support
Author :
Rao, Vedula Venkateswara
Author_Institution :
Dept. of CSE, SRI Vasavi Eng. Coll., Tadepalligudem, India
Abstract :
This paper presents the Btree index, a general and compact structure that tightly integrates item set extraction into a relational DBMS. The underlying databases may be transactional or relational. Because no constraint is enforced during the index creation phase, the Btree index provides a complete representation of the original database. The index builds a prefix-tree-like structure and stores the addresses of the disk blocks that hold the relational database records, together with pointers to those blocks. Using these pointers, it loads the required database records into main memory and finds the corresponding item sets for association rule mining. To reduce I/O cost, data accessed together during the same extraction phase are clustered on the same disk block. The Btree index structure can be efficiently exploited by different item set extraction algorithms. In particular, the Btree index data access methods currently support the FP-growth, IBTree, FPMax, and LCM v.2 algorithms, but they can straightforwardly support the enforcement of various constraint categories. The Btree index has been integrated into the SQL Server, PostgreSQL, and Oracle DBMSs and exploits their physical-level access methods. Experiments on both sparse and dense data distributions show the efficiency of the proposed index and its linear scalability, even for large data sets. Item set mining supported by the Btree index shows performance that is always comparable with, and often (especially for low supports) better than, that of state-of-the-art algorithms accessing data in flat files.
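The prefix-tree idea described in the abstract can be illustrated with a minimal in-memory sketch. It is not the paper's implementation: the names `Node`, `build_prefix_tree`, and `support` are hypothetical, the sample transactions are made up for illustration, and the disk-block addresses and clustering the abstract mentions are left out entirely. The sketch only shows how transactions sharing a prefix share tree nodes, and how the support of an item set can be read back from node counts.

```python
from collections import Counter


class Node:
    """One prefix-tree node: an item plus the number of transactions through it."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


def build_prefix_tree(transactions):
    """Build a prefix tree with items ordered by descending global frequency."""
    freq = Counter(item for t in transactions for item in set(t))
    root = Node()
    for t in transactions:
        root.count += 1  # root counts every transaction
        # Ties in frequency are broken by item name so the order is total.
        ordered = sorted(set(t), key=lambda i: (-freq[i], i))
        node = root
        for item in ordered:
            node = node.children.setdefault(item, Node(item))
            node.count += 1
    return root, freq


def support(root, freq, itemset):
    """Return the number of transactions that contain every item in itemset."""
    def rank(i):
        return (-freq[i], i)

    target = sorted(set(itemset), key=rank)
    if not target:
        return root.count

    def walk(node, idx):
        total = 0
        for child in node.children.values():
            if child.item == target[idx]:
                if idx == len(target) - 1:
                    total += child.count
                else:
                    total += walk(child, idx + 1)
            elif rank(child.item) < rank(target[idx]):
                # The wanted item may still appear deeper on this path.
                total += walk(child, idx)
            # Otherwise every item below ranks later, so prune this branch.
        return total

    return walk(root, 0)


# Illustrative data only; not taken from the paper.
transactions = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]
root, freq = build_prefix_tree(transactions)
print(support(root, freq, {"bread", "milk"}))
```

In a disk-backed index of the kind the abstract describes, each node would presumably also carry a pointer to the disk block holding its records; that part is omitted here because the abstract does not specify the layout.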
Keywords :
SQL; data mining; relational databases; trees (mathematics); Btree index data access methods; Btree index structure; Oracle DBMS; PostgreSQL; SQL SERVER; association rule mining; indexing support; item set extraction algorithm; item set mining; relational DBMS; transactional databases; Algorithm design and analysis; Association rules; Indexing; Vegetation; Association Rule Mining; Btree; Data Mining; Fpgrowth; Indexing; Item Set Extraction; LCM; Market Basket Analysis;
Conference_Title :
Recent Trends in Information Technology (ICRTIT), 2011 International Conference on
Conference_Location :
Chennai, Tamil Nadu
Print_ISBN :
978-1-4577-0588-5
DOI :
10.1109/ICRTIT.2011.5972386