Title :
Can Collective Use Help for Searching?
Author :
Dicheva, Darina ; Dichev, Christo
Author_Institution :
Comput. Sci. Dept., Winston-Salem State Univ., Winston-Salem, NC, USA
Abstract :
In this paper we propose a "find similar" method intended to extend the searching capabilities of digital collections targeting educational and academic domains. Given a document, the described algorithm finds similar documents that may be of interest to the user. It exploits the metadata typical of the participatory web. In the adopted model, documents are viewed as objects associated with a set of tags and a set of users who have tagged them, inducing tag-based and user-based similarity. The similarity between two documents is computed as a combination of their tag-based and user-based cosine similarities and the document recency. We have conducted a series of experiments using a CiteULike dump to investigate the properties of the proposed similarity measure. The experimental results indicate that the algorithm, which exploits meta-information about the documents, provides a good approximation of our understanding of the contextual dependency of the notion of similarity.
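The similarity measure summarized in the abstract can be illustrated with a minimal sketch. The abstract does not state how the three components are combined, so the weighted sum, the weights (`w_tag`, `w_user`, `w_rec`), the half-life recency decay, and all function names and sample data below are assumptions made for illustration only, not the authors' formulation.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity of two sparse count vectors given as Counters."""
    shared = a.keys() & b.keys()
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recency(year, current_year, half_life=5.0):
    """Hypothetical recency score: halves every `half_life` years of age."""
    age = max(0, current_year - year)
    return 0.5 ** (age / half_life)


def similarity(doc, other, current_year, w_tag=0.5, w_user=0.4, w_rec=0.1):
    """Assumed combination: weighted sum of tag cosine, user cosine, recency."""
    tag_sim = cosine(doc["tags"], other["tags"])
    # Users are treated as binary vectors: 1 if the user tagged the document.
    user_sim = cosine(Counter(doc["users"]), Counter(other["users"]))
    rec = recency(other["year"], current_year)
    return w_tag * tag_sim + w_user * user_sim + w_rec * rec


def find_similar(query, candidates, current_year, top_k=3):
    """Return ids of the top_k candidates ranked by descending similarity."""
    scored = [(similarity(query, c, current_year), c["id"]) for c in candidates]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:top_k]]


# Made-up sample data, not taken from the CiteULike dump.
query = {"id": "q", "tags": Counter(folksonomy=3, tagging=2),
         "users": {"u1", "u2"}, "year": 2011}
candidates = [
    {"id": "a", "tags": Counter(folksonomy=2, tagging=1),
     "users": {"u1", "u2"}, "year": 2010},
    {"id": "b", "tags": Counter(graphs=4),
     "users": {"u9"}, "year": 2011},
]

print(find_similar(query, candidates, current_year=2011))
```

In this sketch a document that shares both tags and taggers with the query ranks above a newer document that shares neither, because recency carries only a small assumed weight.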
Keywords :
Internet; document handling; identification technology; meta data; pattern matching; query formulation; CiteULike dump; Web participatory; academic domain; contextual dependency; digital collection; document recency; educational domain; metadata; searching capability; tag-based similarity; user-based cosine similarity; Accuracy; Bipartite graph; Collaboration; Complex networks; Humans; Tagging; Vectors; finding similar documents; folksonomy; information retrieval;
Conference_Titel :
Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), 2011 International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4577-1827-4
DOI :
10.1109/CyberC.2011.14