Title :
Web 2.0 social bookmark selection for tag clustering
Author :
Kumar, Sahoo Subhendu ; Inbarani, H.H.
Author_Institution :
Dept. Comput. Sci., Periyar Univ., Salem, India
Abstract :
Tagging is a popular way to annotate Web 2.0 web sites. A tag is any user-generated word or phrase that helps to organize Web 2.0 content. The current hype around Web 2.0 applications poses several important challenges for future data and web mining methods. One important challenge of Web 2.0 is that a large amount of data has been generated over a short period. Clustering the tag data is very tedious because the tag space is very large in several social bookmarking web sites. So, instead of clustering the whole tag space of Web 2.0 data, tags that are frequent enough in the tag space can be selected for clustering by applying feature selection techniques. The goal of feature selection is to determine a marginal bookmarked URL subset from a Web 2.0 data set while retaining a suitably high accuracy in representing the original bookmarks. Tag clustering is the process of grouping similar tags into the same cluster and is important for the success of collaborative tagging services. In this paper, the Unsupervised Quick Reduct feature selection algorithm is applied to find a set of the most commonly tagged bookmarks, and then clustering techniques such as Soft Rough Fuzzy clustering and Rough K-Means are applied to cluster user-generated tags. The performance of these clustering approaches is illustrated.
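As an illustration of the clustering stage named in the abstract, the following is a minimal sketch of Rough K-Means in its commonly cited lower/upper approximation form. It is not the authors' implementation. It assumes the bookmark columns have already been chosen by a feature selection step (not shown), and the function names, the threshold and weight values, and the toy tag-by-bookmark matrix are all hypothetical.

```python
import math

def mean_vector(points, indices):
    """Component-wise mean of the points at the given indices."""
    dim = len(points[0])
    return [sum(points[i][d] for i in indices) / len(indices) for d in range(dim)]

def rough_k_means(points, centroids, threshold=1.3, w_lower=0.7, max_iter=50):
    """Rough K-Means sketch (threshold and w_lower are assumed values).

    Each cluster gets a lower approximation (tags that certainly belong)
    and an upper approximation (tags that possibly belong).
    Returns (centroids, lower, upper) with tags given as indices into points.
    """
    k = len(centroids)
    centroids = [list(c) for c in centroids]
    lower, upper = [], []
    for _ in range(max_iter):
        lower = [[] for _ in range(k)]
        upper = [[] for _ in range(k)]
        for idx, p in enumerate(points):
            dists = [math.dist(p, c) for c in centroids]
            nearest = min(range(k), key=dists.__getitem__)
            # Other clusters that are almost as close as the nearest one.
            close = [j for j in range(k)
                     if j != nearest and dists[j] <= threshold * dists[nearest]]
            if close:
                # Ambiguous tag: only in upper approximations (boundary region).
                for j in [nearest] + close:
                    upper[j].append(idx)
            else:
                # Unambiguous tag: in both lower and upper of its cluster.
                lower[nearest].append(idx)
                upper[nearest].append(idx)
        new_centroids = []
        for j in range(k):
            boundary = [i for i in upper[j] if i not in lower[j]]
            if lower[j] and boundary:
                lo = mean_vector(points, lower[j])
                bd = mean_vector(points, boundary)
                c = [w_lower * a + (1 - w_lower) * b for a, b in zip(lo, bd)]
            elif lower[j]:
                c = mean_vector(points, lower[j])
            elif boundary:
                c = mean_vector(points, boundary)
            else:
                c = centroids[j]  # empty cluster: keep the old centroid
            new_centroids.append(c)
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, lower, upper

# Hypothetical toy data: rows are tags, columns are four selected bookmarks
# (1 means the tag was applied to that bookmark).
tags = ["python", "code", "recipes", "food", "blog"]
matrix = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
]
centroids, lower, upper = rough_k_means(matrix, [matrix[0], matrix[2]])
for j in range(2):
    print("cluster", j,
          "lower:", [tags[i] for i in lower[j]],
          "upper:", [tags[i] for i in upper[j]])
```

In this toy run, "blog" is equally distant from both centroids, so it lands only in the two upper approximations, while the other four tags sit in the lower approximation of exactly one cluster.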
Keywords :
Internet; Web sites; data mining; fuzzy set theory; pattern clustering; rough set theory; Web 2.0 Web site; Web 2.0 social bookmark selection; Web mining; collaborative tagging service; data mining; feature selection technique; marginal bookmarked URL subset; rough K-means algorithm; social book marking Web site; soft rough fuzzy clustering; unsupervised quick reduct feature selection algorithm; user generated tag clustering; user-generated phrase; user-generated word; Algorithm design and analysis; Clustering algorithms; Indexes; Informatics; Pattern recognition; Tagging; Web 2.0; Bookmark selection; Clustering; Fuzzy soft Rough clustering; Rough K means; Unsupervised Quick Reduct;
Conference_Title :
Pattern Recognition, Informatics and Mobile Engineering (PRIME), 2013 International Conference on
Conference_Location :
Salem
Print_ISBN :
978-1-4673-5843-9
DOI :
10.1109/ICPRIME.2013.6496724