DocumentCode
2677755
Title
Data organization and access for efficient data mining
Author
Dunkel, Brian ; Soparkar, Nandit
Author_Institution
Dept. of Electr. Eng. & Comput. Sci., Michigan Univ., Ann Arbor, MI, USA
fYear
1999
fDate
23-26 Mar 1999
Firstpage
522
Lastpage
529
Abstract
Efficient mining of data presents a significant challenge, due to problems of combinatorial explosion in the space and time often required for such processing. While previous work has focused on improving the efficiency of the mining algorithms, we consider how the representation, organization, and access of the data may significantly affect performance, especially when I/O costs are also considered. By a simple analysis and comparison of the counting stage for the Apriori association rules algorithm, we show that a “column-wise” approach to data access is often more efficient than the standard row-wise approach. We also provide the results of empirical simulations to validate our analysis. The key idea in our approach is that counting in the Apriori algorithm with data accessed in a column-wise manner significantly reduces the number of disk accesses required to identify itemsets with a minimum support in the database, primarily by reducing the degree to which data and counters need to be repeatedly brought into memory.
Keywords
data handling; data mining; information retrieval; I/O costs; a priori association rules algorithm; combinatorial explosion; data access; data organization; disk accesses; itemsets; mining algorithms; standard row-wise approach; Algorithm design and analysis; Argon; Association rules; Computer science; Costs; Data mining; Delta modulation; Electrical capacitance tomography; Explosions; Itemsets
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Engineering, 1999. Proceedings., 15th International Conference on
Conference_Location
Sydney, NSW
ISSN
1063-6382
Print_ISBN
0-7695-0071-4
Type
conf
DOI
10.1109/ICDE.1999.754968
Filename
754968