Title :
Relational clustering based on a new robust estimator with application to Web mining
Author :
Nasraoui, Olfa ; Krishnapuram, Raghu ; Joshi, Anupam
Author_Institution :
Missouri Univ., Columbia, MO, USA
Abstract :
Mining typical user profiles and URL associations from the vast amount of access logs is an important component of Web personalization. In this paper, we define the notion of a "user session" as a temporally compact sequence of Web accesses by a user. We also define a dissimilarity measure between two Web sessions that captures the organization of a Web site. To cluster the user sessions based on the pairwise dissimilarities, we introduce the relational fuzzy c-maximal density estimator (RFC-MDE) algorithm. RFC-MDE is robust and can deal with outliers that are typical in this application. We show real examples of the use of RFC-MDE for extraction of user profiles from log data, and compare its performance to the standard non-Euclidean fuzzy c-means.
Keywords :
data loggers; data mining; fuzzy set theory; information resources; maximum likelihood sequence estimation; pattern clustering; relational algebra; RFC-MDE algorithm; URL associations; Web site organization; World Wide Web data mining; World Wide Web personalization; access logs; dissimilarity measure; nonEuclidean fuzzy c-means clustering method; outliers; pairwise dissimilarities; performance; relational clustering; relational fuzzy c-maximal density estimator; robust estimator; temporally compact access sequence; user profiles; user session; Clustering algorithms; Contamination; Data mining; Databases; Noise robustness; Pollution measurement; Recommender systems; Search engines; Uniform resource locators; Web mining;
Conference_Title :
Fuzzy Information Processing Society, 1999. NAFIPS. 18th International Conference of the North American
Conference_Location :
New York, NY
Print_ISBN :
0-7803-5211-4
DOI :
10.1109/NAFIPS.1999.781785