Title :
Web document clustering approach using wordnet lexical categories and fuzzy clustering
Author :
Gharib, T.F. ; Fouad, Mohammed M. ; Aref, Mostafa M.
Author_Institution :
Fac. of Comput. & Inf. Sci., Ain Shams Univ., Cairo
Abstract :
Web mining is defined as applying data mining techniques to the content, structure, and usage of Web resources. Three areas of Web mining are commonly distinguished: content mining, structure mining, and usage mining. In all these areas, a wide range of general data mining techniques, in particular association rule discovery, clustering, classification, and sequence mining, are employed and developed further to reflect the specific structures of Web resources and the specific questions posed in Web mining. In this paper, we introduce a Web document clustering approach that uses WordNet lexical categories and the fuzzy c-means algorithm to improve clustering performance for Web documents. Experiments show that the fuzzy c-means algorithm achieves a substantial performance improvement over recent document clustering algorithms.
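The clustering step named in the abstract can be illustrated with a minimal sketch of the standard fuzzy c-means algorithm. This record does not give the paper's feature extraction or its exact algorithm settings, so everything below is an assumption for illustration: the function name `fuzzy_c_means`, the fuzzifier `m=2.0`, the stopping tolerance, and the toy document vectors, which are made-up word counts over three WordNet lexical categories (`noun.animal`, `noun.artifact`, `verb.motion`).

```python
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on a list of equal-length numeric vectors.

    Returns (centers, memberships); memberships[i][j] is the degree to
    which point i belongs to cluster j, and each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Centre update: mean of all points weighted by membership ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )

        # Membership update from squared Euclidean distances to the centres.
        shift = 0.0
        for i in range(n):
            d2 = [sum((points[i][d] - centers[j][d]) ** 2 for d in range(dim))
                  for j in range(c)]
            if min(d2) == 0.0:
                # The point coincides with one or more centres.
                zeros = [j for j in range(c) if d2[j] == 0.0]
                new = [1.0 / len(zeros) if j in zeros else 0.0
                       for j in range(c)]
            else:
                inv = [dj ** (-1.0 / (m - 1.0)) for dj in d2]
                total = sum(inv)
                new = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, u[i])))
            u[i] = new

        if shift < tol:  # memberships have stopped changing
            break

    return centers, u


# Hypothetical documents as counts over three lexical categories:
# [noun.animal, noun.artifact, verb.motion]. The values are invented.
docs = [[5, 0, 1], [4, 1, 0], [0, 5, 1], [1, 4, 0]]
centers, memberships = fuzzy_c_means(docs, c=2)

for doc, row in zip(docs, memberships):
    print(doc, [round(v, 3) for v in row])
```

Unlike hard clustering, each document receives a graded membership in every cluster, so a page that mixes topics can belong partly to several groups. Representing documents by a small number of lexical categories rather than by individual terms keeps the vectors short, which is presumably part of the appeal of the approach described above.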
Keywords :
Internet; content management; data mining; document handling; fuzzy set theory; pattern clustering; Web document clustering; Web mining; Web resource; WordNet lexical category; association rule discovery; content mining; fuzzy c-means algorithm; fuzzy clustering; sequence mining; structure mining; usage mining; Artificial intelligence; Association rules; Clustering algorithms; Clustering methods; Computer science; Information science; Optimization; Text mining
Conference_Title :
Computer and Information Technology, 2008. ICCIT 2008. 11th International Conference on
Conference_Location :
Khulna
Print_ISBN :
978-1-4244-2135-0
Electronic_ISBN :
978-1-4244-2136-7
DOI :
10.1109/ICCITECHN.2008.4803109