Title :
On Constructing Regular Expressions of Web Page Traversals for Efficient Filtering
Author :
Kumaravel, A. ; Pradeepa, R.
Author_Institution :
Dept. of Comput. Sci. & Eng., Bharath Univ., Chennai, India
Abstract :
To reflect the importance of web pages in web mining results, an algorithm based on regular expression constraints is presented for mining weighted closed sequential patterns. First, the algorithm scans the sequence database once, finds all weighted frequent items, and builds a sparse matrix to identify the significant item sets that satisfy the minimum support and the minimum weighted support count. Second, in-memory index sets for the frequent weighted items are constructed, the stems of the frequent weighted items are found in the index sets, and the frequent weighted items are then extended by these stems. The algorithm filters out sequential patterns that cannot satisfy the user-specified regular expression constraints, which reduces the search space. The support of each discovered sequential pattern is stored as the key in a hash table for closure checking. The algorithm lets users change the constraints for interactive mining, and helps them focus the mining on the patterns that interest them while reflecting the importance of each item. The experimental results show that this method is more efficient at mining weighted closed sequential patterns and satisfies the requirements of users.
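The steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it replaces the sparse matrix and index sets with plain brute-force prefix extension, and it assumes that weighted support is the support multiplied by the average page weight of the pattern, which the abstract does not define. The function names, the single-character page ids, the weights, and the thresholds are all hypothetical. Closure is checked only among the patterns that pass the regular expression constraint.

```python
import re
from collections import defaultdict


def is_subsequence(pattern, sequence):
    """True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def support(pattern, database):
    """Number of traversal sequences that contain `pattern`."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def weighted_support(pattern, database, weights):
    """ASSUMPTION: support scaled by the average weight of the pattern's pages."""
    avg_weight = sum(weights[page] for page in pattern) / len(pattern)
    return support(pattern, database) * avg_weight


def mine(database, weights, min_sup, min_wsup, constraint, max_len=4):
    """Return {pattern: support} for closed patterns that match `constraint`."""
    pages = sorted({page for seq in database for page in seq})
    regex = re.compile(constraint)
    by_support = defaultdict(list)  # hash table keyed by support
    frontier = [(page,) for page in pages]
    while frontier:
        next_frontier = []
        for pat in frontier:
            sup = support(pat, database)
            if sup < min_sup:
                continue  # plain support is anti-monotone, so prune here
            if len(pat) < max_len:
                next_frontier.extend(pat + (page,) for page in pages)
            # Weighted support is not anti-monotone, so it only filters output.
            if (weighted_support(pat, database, weights) >= min_wsup
                    and regex.fullmatch("".join(pat))):
                by_support[sup].append(pat)
        frontier = next_frontier
    closed = {}
    for sup, pats in by_support.items():
        for p in pats:
            # Closed: no longer pattern with the same support contains p.
            if not any(len(q) > len(p) and is_subsequence(p, q) for q in pats):
                closed[p] = sup
    return closed


# Hypothetical traversal log: each string is one session, each letter one page.
database = ["abcd", "abd", "abcd", "bd"]
weights = {"a": 1.0, "b": 0.5, "c": 0.8, "d": 1.0}
closed = mine(database, weights, min_sup=2, min_wsup=1.5, constraint=r"a.*d")
```

On this toy input, `closed` holds `('a', 'b', 'd')` with support 3 and `('a', 'b', 'c', 'd')` with support 2. The patterns `ad` and `acd` also match the constraint, but each has a longer matching pattern with the same support, so the closure check drops them.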
Keywords :
Web sites; data mining; file organisation; information filtering; sparse matrices; Web mining result; Web page traversals; closure checking; efficient filtering; frequent weighted items; hash table; index sets; regular expression constraints; regular expressions; sequence database; sparse matrix; weighted closed sequential patterns; Algorithm design and analysis; Data mining; Educational institutions; Indexes; Sparse matrices; Web pages; Minimum Weighted Support Count; Regular Expression; Sparse Matrix; Weighted Closed Sequential Pattern;
Conference_Title :
Radar, Communication and Computing (ICRCC), 2012 International Conference on
Conference_Location :
Tiruvannamalai
Print_ISBN :
978-1-4673-2756-5
DOI :
10.1109/ICRCC.2012.6450567