• DocumentCode
    608086
  • Title
    IC-BIDE: Intensity Constraint-Based Closed Sequential Pattern Mining for Coding Pattern Extraction
  • Author
    Takei, H.; Yamana, Hayato
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Waseda Univ., Tokyo, Japan
  • fYear
    2013
  • fDate
    25-28 March 2013
  • Firstpage
    976
  • Lastpage
    983
  • Abstract
    We propose an intensity constraint-based closed sequential pattern mining algorithm, called IC-BIDE, for coding pattern extraction. Source code often contains frequent patterns of function calls or control flows, i.e., "coding patterns." Previous studies used sequential pattern mining to extract coding patterns; however, these algorithms have not been optimized for coding pattern extraction, which results in useless patterns as well as long execution times. We propose a new constraint, called the "intensity constraint," to enhance closed sequential pattern mining and extract coding patterns efficiently. Our proposed algorithm is based on BI-Directional Extension (BIDE), an algorithm proposed expressly for closed sequential pattern mining. The BIDE algorithm by itself cannot be adapted to constraint-based closed sequential pattern mining. We extend the BIDE algorithm and prove that our extended algorithm can be adapted to intensity constraint-based closed sequential pattern mining. Our contributions are as follows: 1) we propose a new constraint, which we call "intensity"; 2) we propose an intensity constraint-based closed sequential pattern mining algorithm, which we call the "IC-BIDE" algorithm. Experimental results with open source software (Bullet Physics, MySQL, and OpenCV) show that the IC-BIDE algorithm effectively excludes useless patterns. Moreover, our proposed method accelerates the extraction by a factor of 8.9 compared with the BIDE algorithm.
  • Keywords
    data mining; encoding; public domain software; Bullet Physics; IC-BIDE; MySQL; OpenCV; bidirectional execution; coding pattern extraction; intensity constraint-based closed sequential pattern mining; open source software; source code; Algorithm design and analysis; Bidirectional control; Data mining; Databases; Encoding; Software; Software algorithms; closed sequential pattern mining; coding pattern extraction; constraint-based pattern mining
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advanced Information Networking and Applications (AINA), 2013 IEEE 27th International Conference on
  • Conference_Location
    Barcelona
  • ISSN
    1550-445X
  • Print_ISBN
    978-1-4673-5550-6
  • Electronic_ISBN
    1550-445X
  • Type
    conf
  • DOI
    10.1109/AINA.2013.79
  • Filename
    6531859