DocumentCode :
174542
Title :
A concept based clustering model for document similarity
Author :
Veena, G. ; Lekha, N.K.
Author_Institution :
Dept. of Comput. Sci. & Applic., Amrita Vishwa Vidyapeetham, Kollam, India
fYear :
2014
fDate :
26-28 Aug. 2014
Firstpage :
118
Lastpage :
123
Abstract :
A great deal of research has been done on concept mining and document similarity in the past few years, but all of these works were based on the statistical analysis of keywords. The major challenge in this area is preserving the semantics of terms and phrases. Our paper proposes a graph model that represents concepts at the sentence level. Each concept follows a triplet representation. A modified DBSCAN algorithm is used to cluster the extracted concepts. The resulting cluster forms a belief network, or probabilistic network. We use this network to extract the most probable concepts in the document. In this paper we also propose a new algorithm for document similarity.
Keywords :
belief networks; document handling; graph theory; pattern clustering; DBSCAN algorithm; belief network; concept based clustering model; document similarity; graph model; probabilistic network; triplet representation; Accuracy; Analytical models; Clustering algorithms; Nanofluidics; Nanomaterials; Probability; Semantics; concept based Extended DB scan algorithm; concept mining model; document similarity;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Science & Engineering (ICDSE), 2014 International Conference on
Conference_Location :
Kochi
Print_ISBN :
978-1-4799-6870-1
Type :
conf
DOI :
10.1109/ICDSE.2014.6974622
Filename :
6974622