Title :
On top-k closed sequential patterns mining
Author :
Jing Wang ; Lei Zhang ; Guiquan Liu ; Qi Liu ; Enhong Chen
Author_Institution :
Univ. of Sci. & Technol. of China, Hefei, China
Abstract :
Sequential pattern mining is an important data mining task. Most previous studies require the user to specify a minimum support threshold before mining can proceed. In practice, however, choosing an appropriate threshold is difficult without background knowledge of the data. In addition, many uninteresting sequential patterns are produced when the minimum support is set too low. An alternative task has been proposed to address these problems: mining top-k frequent closed sequences under a minimum length constraint, that is, mining the k most frequent closed sequences whose length is at least min_len. However, most previous algorithms for this task are based on a candidate generation framework, which leads to high memory usage and long running times. To this end, we propose an efficient algorithm named BI-TSP (mining top-k closed sequential patterns with a BI-Directional checking scheme) that mines top-k frequent closed sequences with a minimum length without candidate generation. Specifically, we adopt BI-Directional Extension to enumerate frequent closed sequential patterns. Based on BI-Directional Extension, we can apply the closure checking scheme directly and raise the minimum support threshold effectively without candidate maintenance. We also propose two novel pruning strategies that exploit the properties of the minimum length constraint. Extensive performance tests on synthetic and real datasets demonstrate that BI-TSP outperforms the baselines in both memory usage and running time.
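To make the task definition concrete, the following is a minimal brute-force sketch in Python. It is not BI-TSP and uses none of its techniques (no BI-Directional Extension, no pruning); it only illustrates what "the k most frequent closed sequences of length at least min_len" means on a tiny input. The function names, the restriction to sequences of single items, and the lexicographic tie-breaking at rank k are all assumptions made for illustration.

```python
from itertools import combinations

def is_subsequence(pattern, sequence):
    # True if the items of `pattern` occur in `sequence` in the same order.
    it = iter(sequence)
    return all(item in it for item in pattern)

def support(pattern, database):
    # Number of database sequences that contain `pattern` as a subsequence.
    return sum(is_subsequence(pattern, seq) for seq in database)

def all_patterns(database):
    # Every non-empty subsequence of every database sequence (exponential;
    # acceptable only for toy inputs).
    patterns = set()
    for seq in database:
        for n in range(1, len(seq) + 1):
            for idx in combinations(range(len(seq)), n):
                patterns.add(tuple(seq[i] for i in idx))
    return patterns

def top_k_closed(database, k, min_len):
    sup = {p: support(p, database) for p in all_patterns(database)}
    # A pattern is closed if no proper super-sequence has the same support.
    closed = [
        p for p in sup
        if not any(
            len(q) > len(p) and sup[q] == sup[p] and is_subsequence(p, q)
            for q in sup
        )
    ]
    eligible = [p for p in closed if len(p) >= min_len]
    # Assumption: ties in support are broken lexicographically.
    eligible.sort(key=lambda p: (-sup[p], p))
    return [(p, sup[p]) for p in eligible[:k]]

if __name__ == "__main__":
    db = ["abc", "abc", "ab", "b"]  # hypothetical toy database
    print(top_k_closed(db, k=2, min_len=2))
```

On this toy database the closed patterns are b (support 4), ab (support 3), and abc (support 2); with k = 2 and min_len = 2 the sketch returns ab and abc, because b is excluded by the length constraint.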
Keywords :
data mining; pattern classification; pattern clustering; BI-directional checking scheme; TSP; top-k closed sequential pattern mining; Bidirectional control; Databases; Heuristic algorithms; Maintenance engineering; Silicon; Upper bound;
Conference_Title :
Fuzzy Systems and Knowledge Discovery (FSKD), 2014 11th International Conference on
Conference_Location :
Xiamen
Print_ISBN :
978-1-4799-5147-5
DOI :
10.1109/FSKD.2014.6980849