DocumentCode
1659624
Title
Rough set clustering for Web mining
Author
Lingras, Pawan
Author_Institution
Saint Mary's Univ., Halifax, NS, Canada
Volume
2
fYear
2002
fDate
2002
Firstpage
1039
Lastpage
1044
Abstract
As in traditional data mining, three important Web mining operations are clustering, association, and sequential analysis. Typical clustering operations in Web mining involve finding natural groupings of Web resources or Web users. Researchers have pointed out some important differences between clustering in conventional applications and clustering in Web mining. For example, the clusters and associations in Web mining do not necessarily have crisp boundaries. Moreover, for a variety of reasons inherent in Web browsing and Web logging, the likelihood of bad or incomplete data is higher. As a result, researchers have studied the possibility of using fuzzy sets in Web mining clustering applications. The paper describes how rough set theory can also be used to develop clustering schemes for Web mining. The unsupervised classification described in the paper uses properties of rough sets along with genetic algorithms to represent clusters as interval sets. The paper also describes the design of an experiment, including data collection and the clustering process. The experiment is used to create interval set representations of groups of Web visitors.
Keywords
data mining; genetic algorithms; information resources; information retrieval; pattern clustering; rough set theory; Web browsing; Web mining; Web resources; Web users; association; clustering; interval sets; logging; natural groupings; rough set clustering; sequential analysis; unsupervised classification; Bioinformatics; Electronic mail; Fuzzy sets; Genomics; Rough sets; Set theory; Web sites
fLanguage
English
Publisher
IEEE
Conference_Titel
Fuzzy Systems, 2002. FUZZ-IEEE'02. Proceedings of the 2002 IEEE International Conference on
Conference_Location
Honolulu, HI
Print_ISBN
0-7803-7280-8
Type
conf
DOI
10.1109/FUZZ.2002.1006647
Filename
1006647