DocumentCode :
3328672
Title :
Document similarity detection using semantic social network analysis on RDF citation graph
Author :
Mahmood, Q. ; Qadir, Muhammad Abdul ; Afzal, Muhammad Tanvir
Author_Institution :
Center for Distrib. & Semantic Comput., Mohammad Ali Jinnah Univ., Islamabad, Pakistan
fYear :
2013
fDate :
9-10 Dec. 2013
Firstpage :
1
Lastpage :
6
Abstract :
Document similarity identification is one of the most significant problems in knowledge discovery and information retrieval. One way to measure similarity is to analyze a citation graph of research papers. If document citation information is available as an RDF graph, how can document similarity be measured using social network analysis techniques? We answer this question by applying semantic social network analysis techniques to RDF citation graphs of research papers to identify the pairwise similarity between these papers. To perform the social network analysis, we use the centrality degree and closeness centrality classes from the SemSNA ontology. The minimum cut/maximum flow concept from graph theory is used to quantify the similarity measure. In our experiment we use the Citeseer data set; for a subset of this data set, our results are promising when compared with manual similarity judgments made by humans. Our results are also encouragingly comparable to other citation link analysis techniques as well as content-based similarity measures, which is why we have focused on an RDF citation-based similarity measure. In future work, we plan to use a citation ontology (such as CITO) to improve RDF graph construction for the proposed similarity measure technique.
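The three graph measures named in the abstract can be illustrated on a toy citation graph. The following is a minimal sketch, not the authors' implementation: the abstract does not say how the measures are combined or how the RDF graph is queried, so the sketch assumes plain (citing, cited) pairs, treats citation links as undirected edges with unit capacity, and uses hypothetical function names throughout.

```python
from collections import deque


def build_graph(citations):
    """Build undirected adjacency sets from (citing, cited) pairs (assumed input form)."""
    adj = {}
    for citing, cited in citations:
        adj.setdefault(citing, set()).add(cited)
        adj.setdefault(cited, set()).add(citing)
    return adj


def degree_centrality(adj):
    """Fraction of the other nodes each node is directly linked to."""
    n = len(adj)
    return {v: len(neighbours) / (n - 1) for v, neighbours in adj.items()}


def closeness_centrality(adj, source):
    """Reachable node count divided by the sum of BFS distances from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


def max_flow(adj, s, t):
    """Edmonds-Karp maximum flow with unit capacity on every link (an assumption).

    By the max-flow/min-cut theorem the result equals the minimum number of
    links whose removal disconnects s from t.
    """
    cap = {(u, w): 1 for u in adj for w in adj[u]}
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for w in adj[u]:
                if w not in parent and cap[(u, w)] > 0:
                    parent[w] = u
                    queue.append(w)
        if t not in parent:
            return flow
        # Push one unit of flow back along the path found.
        w = t
        while parent[w] is not None:
            u = parent[w]
            cap[(u, w)] -= 1
            cap[(w, u)] += 1
            w = u
        flow += 1


# Toy data: papers A and B cite each other and share the references C and D.
citations = [("A", "C"), ("B", "C"), ("A", "D"), ("B", "D"), ("A", "B")]
adj = build_graph(citations)
print(max_flow(adj, "A", "B"))
print(degree_centrality(adj)["A"], closeness_centrality(adj, "C"))
```

In this sketch a larger flow value between two papers means more independent citation paths connect them, which is one plausible reading of how minimum cut/maximum flow could quantify similarity.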
Keywords :
citation analysis; document handling; graph theory; social networking (online); CITO; Citeseer data set; RDF citation graph; RDF graph construction; SemSNA ontology; centrality degree; citation link analysis techniques; citation ontology; closeness centrality; content based similarity measures; document citation information; document similarity detection; document similarity identification; document similarity measures; graph theory; information retrieval; knowledge discovery; maximum flow; minimum cut; pairwise similarity identification; research papers; semantic social network analysis techniques; similarity measure quantification; Algorithm design and analysis; Fluid flow measurement; Manuals; Ontologies; Resource description framework; Semantics; Social network services; RDF; citation graph; document similarity; semantic social network analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Emerging Technologies (ICET), 2013 IEEE 9th International Conference on
Conference_Location :
Islamabad
Print_ISBN :
978-1-4799-3456-0
Type :
conf
DOI :
10.1109/ICET.2013.6743548
Filename :
6743548