Title :
Dependable performance analysis for fuzzy clustering of web usage data
Author :
Ketata, Amir ; Mudur, Sudhir ; Shiri, Nematollaah
Date :
March 30, 2009 - April 2, 2009
Abstract :
Fuzzy clustering is a popular method for modeling web usage data, and a number of techniques have been proposed. The performance of such techniques has been demonstrated through experiments on datasets that are often limited in size and/or variety. This is mainly due to the difficulty of acquiring large real-world datasets and to the considerable time and effort required to perform experiments. We investigate ways to ensure the dependability of such results and their analyses, and consider three issues. First, the clustering quality indices used for comparing different techniques must not be biased towards any parameter specific to one of them. Second, measuring quality through an application of the usage model provides more ground truth than the clustering quality index alone. Third, given the limited datasets and experiments, statistical significance testing can provide more confidence that the results obtained are not due to mere chance. We present our approach to dependable performance analysis using some well-known fuzzy clustering techniques, with prediction quality as the application-specific metric.
Keywords :
Internet; data mining; fuzzy set theory; pattern clustering; Web usage data; dependable performance analysis; fuzzy clustering; prediction quality; Algorithm design and analysis; Clustering algorithms; Fuzzy sets; Performance analysis; Predictive models; Recommender systems; Testing; Web mining;
Conference_Title :
Computational Intelligence and Data Mining, 2009. CIDM '09. IEEE Symposium on
Conference_Location :
Nashville, TN
Print_ISBN :
978-1-4244-2765-9
DOI :
10.1109/CIDM.2009.4938660