DocumentCode
3001165
Title
Comparative analysis of similarity measures for sentence level semantic measurement of text
Author
Saad, Shaharil Mad ; Kamarudin, Siti Sakira
Author_Institution
Product Quality & Reliability Eng., MIMOS Berhad, Kuala Lumpur, Malaysia
fYear
2013
fDate
Nov. 29 2013-Dec. 1 2013
Firstpage
90
Lastpage
94
Abstract
The accuracy of similarity measurement between sentences is critical to the performance of several applications, such as text mining, question answering, and text summarization. This paper focuses on calculating semantic similarity between sentences and on a comparative analysis of the identified similarity measurement techniques. Three popular similarity measures are compared: Jaccard, Cosine, and Dice. The performance of each measure was evaluated and recorded. WordNet, a large lexical database of English, is used to calculate word-to-word semantic similarity. The results indicate that the Jaccard and Dice measures perform better at measuring semantic similarity between sentences.
Keywords
database management systems; natural language processing; text analysis; Cosine similarity measure; Dice similarity measure; English lexical database; Jaccard similarity measure; WordNet; comparative analysis; sentence semantic similarity; similarity measurement technique; text sentence level semantic measurement; word-to-word semantic similarity; Benchmark testing; Conferences; Control systems; Information retrieval; Measurement techniques; Semantics; Vectors; Semantic Similarity; Sentence Similarity; Similarity Measurement;
fLanguage
English
Publisher
IEEE
Conference_Titel
Control System, Computing and Engineering (ICCSCE), 2013 IEEE International Conference on
Conference_Location
Mindeb
Print_ISBN
978-1-4799-1506-4
Type
conf
DOI
10.1109/ICCSCE.2013.6719938
Filename
6719938