Title :
An improved parallel algorithm for sequence mining
Author :
She, Chundong ; Tang, Jian ; Li, Lei ; Wang, Hongbing ; Fan, Zhihua
Author_Institution :
Inst. of Software, Chinese Acad. of Sci., Beijing, China
Abstract :
Finding frequent sequences in a large database is increasingly important in data mining. This paper briefly introduces the basic concepts of frequent sequence mining and presents a data-parallel formulation and a task-parallel formulation of a tree-projection-based algorithm. The on-line LPT algorithm is used to solve the imbalance problem in the task-parallel formulation. Our experiments show that both formulations achieve good speedups, and that the task-parallel formulation is more scalable than the data-parallel one.
Keywords :
data mining; parallel algorithms; trees (mathematics); very large databases; data mining; data parallel formulation; frequent sequence mining; large database; online LPT algorithm; parallel algorithm; task parallel formulation; tree-projection based algorithm; Concurrent computing; Data mining; Databases; Distributed computing; Frequency; Parallel algorithms; Parallel processing; Partitioning algorithms; Sequences; Web pages;
Conference_Title :
Mechatronics and Automation, 2005 IEEE International Conference
Conference_Location :
Niagara Falls, Ont., Canada
Print_ISBN :
0-7803-9044-X
DOI :
10.1109/ICMA.2005.1626812