Title :
Parallel frequent itemset mining on streaming data
Author :
Yanshan He ; Min Yue
Author_Institution :
Electron. & Inf. Sci. Dept., Lanzhou Jiaotong Univ., Lanzhou, China
Abstract :
Owing to the widespread use of data streams, frequent itemset mining on data streams has received increasing attention. A data stream is fast-changing, massive, and potentially infinite, so new data structures and algorithms are needed to mine it. Building on our previous work, we propose a new parallel frequent itemset mining algorithm for data streams based on a sliding window, called PFIMSD. The algorithm compresses all the data in the current window into PSD-trees on parallel processors in a single scan. An incremental method appends or deletes the related branches of the PSD-trees as the window slides. Experiments show that PFIMSD performs well in terms of efficiency and scalability.
Keywords :
data compression; data mining; parallel processing; tree data structures; PFIMSD algorithm; PSD-trees; branch appending; branch deletion; data streaming; data structure; increment method; parallel frequent itemset mining; paralleled processor; sliding window; Algorithm design and analysis; Approximation algorithms; Data mining; Data structures; Itemsets; Parallel algorithms; Frequent Itemset Mining; Frequent Pattern; High Performance; Paralleled; Streaming Data;
Conference_Title :
Natural Computation (ICNC), 2014 10th International Conference on
Conference_Location :
Xiamen
Print_ISBN :
978-1-4799-5150-5
DOI :
10.1109/ICNC.2014.6975926
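
Illustrative sketch :
The abstract describes compressing the transactions of the current sliding window into a tree in one scan, then appending or deleting branches incrementally as the window slides. The record does not give the layout of the PSD-tree or the way work is split across processors, so the Python sketch below is only a generic, single-process prefix tree under stated assumptions: items are kept in sorted order, the window holds a fixed number of transactions, and the names `Node`, `SlidingWindowTree`, `add`, and `support` are hypothetical. It is not the PFIMSD algorithm itself.

```python
from collections import deque


class Node:
    """One prefix-tree node: an item, a count, and children keyed by item."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


class SlidingWindowTree:
    """Prefix tree over the transactions inside a fixed-size window.

    Assumption: a plain prefix tree with sorted items stands in for the
    PSD-tree, whose actual structure is not given in the abstract.
    """

    def __init__(self, window_size):
        self.window_size = window_size
        self.root = Node()
        self.window = deque()  # transactions in arrival order

    def _insert(self, items):
        # Append a branch (or bump counts along an existing one).
        node = self.root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = Node(item)
                node.children[item] = child
            child.count += 1
            node = child

    def _remove(self, items):
        # Decrement counts along the branch; prune it once a count hits zero.
        node = self.root
        for item in items:
            child = node.children[item]
            child.count -= 1
            if child.count == 0:
                del node.children[item]
                return
            node = child

    def add(self, transaction):
        """Slide the window by one transaction, updating the tree in place."""
        items = tuple(sorted(set(transaction)))
        if len(self.window) == self.window_size:
            self._remove(self.window.popleft())
        self.window.append(items)
        self._insert(items)

    def support(self, itemset):
        """Number of window transactions containing every item of itemset."""
        target = tuple(sorted(set(itemset)))
        if not target:
            return len(self.window)
        return self._count(self.root, target)

    def _count(self, node, target):
        total = 0
        for item, child in node.children.items():
            if item == target[0]:
                if len(target) == 1:
                    total += child.count
                else:
                    total += self._count(child, target[1:])
            elif item < target[0]:
                total += self._count(child, target)
            # item > target[0]: sorted paths below cannot contain target[0]
        return total
```

An itemset would be treated as frequent when `support(itemset)` reaches a chosen minimum support threshold. A parallel variant could, for instance, build one such tree per processor over a partition of the window and sum the per-tree supports, but that partitioning is an assumption and is not taken from the record.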