Title :
Novel position-coded methods for mining web access patterns
Author :
Wang, Wenjia ; Cao-Thai, Phuong Thanh
Author_Institution :
Sch. of Comput. Sci., East Anglia Univ., Norwich
Abstract :
This paper briefly introduces two novel algorithms, PLWAPI and PLWAP2, modified from the position pre-ordered linked Web access pattern (PLWAP) algorithm, for mining Web access patterns (WAPs) from Web usage log data. Their basic ideas are to create a new header that links only the nodes under the new root of a WAP tree, and to reuse the WAP tree through cloning in every recursion of the mining process. They have been tested against three existing popular algorithms, i.e. WAP tree, the conditional sequence (CS) and PLWAP, with synthetic benchmark data and real-world Web log data collected from two Web sites. The experimental results indicated that PLWAP2 performed slightly better than PLWAP whilst using less memory, but PLWAPI was much more efficient than all the others, particularly on large data sets with long access sequences (more than 10 events).
Keywords :
Internet; data mining; PLWAP2; PLWAPI; WAP tree; Web access pattern mining; Web usage log data; conditional sequence; position preordered linked Web access pattern; position-coded method; Algorithm design and analysis; Artificial intelligence; Frequency; Helium; IEEE members; Information analysis; Pattern analysis; Social network services; Uniform resource locators
Conference_Title :
Intelligence and Security Informatics, 2008. ISI 2008. IEEE International Conference on
Conference_Location :
Taipei
Print_ISBN :
978-1-4244-2414-6
Electronic_ISBN :
978-1-4244-2415-3
DOI :
10.1109/ISI.2008.4565054