• DocumentCode
    2744347
  • Title

    A Sequential Pattern Mining Algorithm Based on Improved FP-tree

  • Author

    Sui, Yi ; Shao, FengJing ; Sun, Rencheng ; Wang, Jinlong

  • Author_Institution
    Coll. of Inf. & Eng., Qingdao Univ., Qingdao
  • fYear
    2008
  • fDate
    6-8 Aug. 2008
  • Firstpage
    440
  • Lastpage
    444
  • Abstract
    Sequential pattern mining is an important data mining problem with broad applications. Most previously developed sequential pattern mining methods need to scan the database many times. In this study, the STMFP algorithm, based on an improved FP-tree, is presented for sequential pattern mining. In the improved FP-tree structure, every node of the tree can store a set of items instead of a single item. After the sequential database has been scanned once, the tree stores all the sequences. In addition, a novel mining method is proposed that combines nodes from leaf to root to find sequential patterns. The cost of mining sequential patterns is divided into two parts. The first is constructing the STMFP tree; this cost depends on the size of the sequential database. The second is finding arbitrary combinations of nodes from leaf to root along every path of the STMFP tree. Because the maximal path length is bounded by the maximal length of one transaction, and because existing common nodes help reduce the number of leaf nodes, the cost of this part must be much less than the size of the database. Compared with other methods, which need to scan the sequential database many times, the cost of our method must be less than two passes over the database. The whole mining process needs to scan the database only once.
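The two phases described in the abstract (building the tree in one pass, then combining nodes along each leaf-to-root path) can be illustrated with a minimal Python sketch. It is not the authors' STMFP implementation. The names `Node`, `build_tree`, `stored_paths`, and `mine` are hypothetical, patterns are restricted to whole itemsets as stored in the nodes, and support is counted by brute-force enumeration of node combinations on each stored path, an assumption chosen for clarity rather than efficiency.

```python
from collections import Counter
from itertools import combinations


class Node:
    """Tree node holding a whole itemset rather than a single item."""

    def __init__(self, itemset=None, parent=None):
        self.itemset = itemset   # frozenset of items; None for the root
        self.parent = parent
        self.children = {}       # maps itemset -> child Node
        self.end_count = 0       # number of sequences that end at this node


def build_tree(database):
    """Insert every sequence in a single pass over the database."""
    root = Node()
    for sequence in database:
        node = root
        for itemset in sequence:
            key = frozenset(itemset)
            if key not in node.children:
                node.children[key] = Node(key, node)
            node = node.children[key]   # shared prefixes reuse existing nodes
        node.end_count += 1
    return root


def stored_paths(root):
    """Yield (path, count) for every node at which some sequence ends."""
    stack = [root]
    while stack:
        node = stack.pop()
        stack.extend(node.children.values())
        if node.end_count:
            path, cur = [], node
            while cur.parent is not None:   # walk from the node up to the root
                path.append(cur.itemset)
                cur = cur.parent
            path.reverse()
            yield tuple(path), node.end_count


def mine(root, min_support):
    """Count node combinations per stored path (assumed brute-force scheme)."""
    support = Counter()
    for path, count in stored_paths(root):
        seen = set()   # each pattern counts once per stored sequence
        for k in range(1, len(path) + 1):
            for idx in combinations(range(len(path)), k):
                seen.add(tuple(path[i] for i in idx))
        for pattern in seen:
            support[pattern] += count
    return {p: s for p, s in support.items() if s >= min_support}
```

For example, `mine(build_tree([[{"a"}, {"b", "c"}, {"d"}], [{"a"}, {"b", "c"}], [{"a"}, {"d"}]]), 2)` reads the database once to build the tree and then works only on the tree. The enumeration is exponential in path length, so the sketch suits only short sequences.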
  • Keywords
    data mining; database management systems; tree data structures; FP-tree structure; STMFP algorithm; data mining; sequential database; sequential pattern mining algorithm; Artificial intelligence; Association rules; Costs; Data engineering; Data mining; Distributed computing; Software algorithms; Software engineering; Sun; Transaction databases;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing, 2008. SNPD '08. Ninth ACIS International Conference on
  • Conference_Location
    Phuket
  • Print_ISBN
    978-0-7695-3263-9
  • Type

    conf

  • DOI
    10.1109/SNPD.2008.161
  • Filename
    4617411