Title of article :
Research on a frequent maximal induced subtrees mining method based on the compression tree sequence
Author/Authors :
Wang, Jing and Liu, Zhaojun and Li, Wei and Li, Xiongfei
Issue Information :
Serial issue, year 2015
Pages :
7
From page :
94
To page :
100
Abstract :
Most complex data structures can be represented as a tree or a graph, and mining tree structures is easier than mining graph structures. With the widespread use of semi-structured data, frequent tree pattern mining has become an active research topic. This paper proposes a compression tree sequence (CTS) for constructing a compression tree model that preserves the information of the original tree. Because any subsequence of the CTS corresponds to a subtree of the original tree, the CTS is efficient for mining subtrees. The paper further proposes CFMIS (compressed frequent maximal induced subtrees), a method for mining frequent maximal induced subtrees based on the compression tree sequence. The algorithm runs in four stages: first, the original data set is converted into a compression tree model; second, a cut-edge reprocess is applied to the edges whose frequency is below the threshold; third, the tree remaining after the cut is compressed according to the different frequent-edge degrees; and finally, maximal processing is run on the frequent subtree sets to obtain the frequent maximal induced subtree set of the original data set. Because compression reduces the size of the data set at each iteration, traversal is faster than in other algorithms. Experiments demonstrate that the algorithm can mine more frequent maximal induced subtrees in less time.
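The cut-edge stage mentioned in the abstract can be illustrated with a minimal sketch. This is not the CFMIS implementation: the abstract does not specify the data layout or the exact definition of edge frequency, so the sketch assumes that each tree is a list of (parent label, child label) edges and that an edge's frequency is the number of trees containing it. The names `edge_support`, `cut_infrequent_edges`, and `min_support` are hypothetical.

```python
from collections import Counter

# Assumed encoding (not from the paper): a tree is a list of
# (parent_label, child_label) edges, and labels identify nodes.

def edge_support(trees):
    """Count, for each labeled edge, how many trees contain it at least once."""
    support = Counter()
    for edges in trees:
        # set() ensures an edge is counted once per tree
        support.update(set(edges))
    return support

def cut_infrequent_edges(trees, min_support):
    """Drop every edge whose support is below min_support (hypothetical threshold)."""
    support = edge_support(trees)
    return [[e for e in edges if support[e] >= min_support] for edges in trees]

trees = [
    [("a", "b"), ("a", "c"), ("b", "d")],
    [("a", "b"), ("b", "d"), ("a", "e")],
    [("a", "b"), ("a", "c")],
]
pruned = cut_infrequent_edges(trees, min_support=2)
```

Cutting an edge may split a tree into several smaller components; under the assumptions above, only those components need to be examined in later stages, which is one way to read the abstract's remark that each iteration shrinks the data set.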
Keywords :
Data mining , Induced subtree , Maximal subtree , Frequent subtree , Compression , CFMIS
Journal title :
Expert Systems with Applications
Serial Year :
2015
Record number :
2355362