Title :
GHIC: a hierarchical pattern-based clustering algorithm for grouping Web transactions
Author :
Yang, Yinghui ; Padmanabhan, Balaji
Author_Institution :
Graduate School of Management, University of California, Davis, CA, USA
Abstract :
Grouping customer transactions into segments may help understand customers better. The marketing literature has concentrated on identifying important segmentation variables (e.g., customer loyalty) and on using cluster analysis and mixture models for segmentation. The data mining literature has provided various clustering algorithms for segmentation without focusing specifically on clustering customer transactions. Building on the notion that observable customer transactions are generated by latent behavioral traits, in this paper, we investigate using a pattern-based clustering approach to grouping customer transactions. We define an objective function that we maximize in order to achieve a good clustering of customer transactions and present an algorithm, GHIC, that groups customer transactions such that itemsets generated from each cluster, while similar to each other, are different from those generated from other clusters. We present experimental results from user-centric Web usage data that demonstrate that GHIC generates a highly effective clustering of transactions.
Keywords :
Internet; consumer behaviour; customer services; data mining; pattern classification; pattern clustering; transaction processing; GHIC; Web mining; Web transactions grouping; association rules; cluster analysis; customer behaviour; customer loyalty; customer transactions grouping; hierarchical pattern-based clustering algorithm; mixture models; user-centric Web usage data; clustering algorithms; itemsets; parametric statistics; classification; clustering
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
DOI :
10.1109/TKDE.2005.145