Title :
On Mining Repeating Pattern with Gap Constraint
Author :
Chiu, Shin-Yi ; Chiu, Shih-Chuan ; Huang, Jiun-Long
Author_Institution :
Dept. of Comput. Sci., Nat. Chiao Tung Univ., Hsinchu, Taiwan
Abstract :
In this paper, we propose a new concept, repeating patterns with gap constraint, which lets repeating patterns tolerate delays between events. To mine repeating patterns with gap constraint, we first show that they satisfy the anti-monotonic property and then propose a level-wise algorithm based on that property, named G-Apriori (standing for Gap with Apriori). Like other level-wise mining algorithms such as Apriori, G-Apriori scans the database several times to count the occurrences of each candidate repeating pattern. As a result, G-Apriori spends much of its time on disk I/O and is not suitable for large databases. In view of this, we develop an index structure that records the positions of the occurrences of each repeating pattern, and then propose algorithm GwI-Apriori (standing for Gap with Index Apriori), which uses the index structure to reduce the number of database scans when mining repeating patterns with gap constraint. The experimental results show that GwI-Apriori is more scalable than G-Apriori in terms of execution time.
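The index-based, level-wise idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' G-Apriori or GwI-Apriori: the abstract does not give the formal definitions, so everything below is an assumption made for illustration. In particular, the sketch assumes that a pattern occurs when consecutive elements are separated by at most `max_gap` intervening events, that support is the number of distinct start positions, and that candidates are grown by extending a frequent pattern with one frequent event. The names `mine_repeating_patterns`, `support`, `max_gap`, and `min_count` are hypothetical.

```python
from collections import defaultdict


def support(occurrences):
    """Assumed support measure: number of distinct start positions."""
    return len({start for start, _ in occurrences})


def mine_repeating_patterns(seq, max_gap, min_count):
    """Level-wise mining over a position index (illustrative sketch only).

    The sequence is scanned once to build the index; every later level
    extends the recorded (start, end) occurrences instead of rescanning.
    """
    # Single scan: record where each event occurs.
    positions = defaultdict(set)
    for pos, event in enumerate(seq):
        positions[event].add(pos)

    # Level 1: occurrences of length-1 patterns are (pos, pos) pairs.
    frequent = {}
    for event, pos_set in positions.items():
        occ = {(p, p) for p in pos_set}
        if support(occ) >= min_count:
            frequent[(event,)] = occ
    symbols = sorted(pattern[0] for pattern in frequent)
    result = dict(frequent)

    # Levels 2, 3, ...: extend each frequent pattern by one frequent event.
    # Under the assumed support measure, an extension can never have more
    # start positions than its prefix, so infrequent prefixes are pruned.
    while frequent:
        next_level = {}
        for pattern, occ in frequent.items():
            for sym in symbols:
                sym_positions = positions[sym]
                extended = {
                    (start, p)
                    for start, end in occ
                    for p in range(end + 1, end + 2 + max_gap)
                    if p in sym_positions
                }
                if support(extended) >= min_count:
                    next_level[pattern + (sym,)] = extended
        result.update(next_level)
        frequent = next_level

    return {pattern: support(occ) for pattern, occ in result.items()}
```

For example, calling `mine_repeating_patterns("ABCAXBCABC", max_gap=1, min_count=3)` treats the `X` between `A` and `B` as a tolerated delay, so `("A", "B", "C")` is counted at all three of its start positions. With `max_gap=0` the middle occurrence no longer counts.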
Keywords :
data mining; pattern recognition; G-Apriori; GwI-Apriori; antimonotonic property; gap constraint; index Apriori; level-wise algorithm; repeating pattern mining; Computer science; DNA; Data mining; Databases; Delay; High performance computing; Indexes; Music information retrieval; Sequences; Uninterruptible power systems
Conference_Title :
High Performance Computing and Communications, 2009. HPCC '09. 11th IEEE International Conference on
Conference_Location :
Seoul
Print_ISBN :
978-1-4244-4600-1
Electronic_ISBN :
978-0-7695-3738-2
DOI :
10.1109/HPCC.2009.65