DocumentCode :
691729
Title :
Paraphrase identification in short texts using grammar patterns
Author :
Vaishnavi, V. ; Saritha, M. ; Milton, R.S.
Author_Institution :
Dept. of Comput. Sci. & Eng., SSN Coll. of Eng., Chennai, India
fYear :
2013
fDate :
25-27 July 2013
Firstpage :
472
Lastpage :
477
Abstract :
We can determine whether two texts are paraphrases of each other by measuring how similar they are. The typical lexical matching technique recognizes paraphrases by matching the sequence of tokens between the texts, and fails when different words are used to convey the same meaning. We can improve this simple method by combining lexical with syntactic or semantic representations of the input texts. The present work makes use of syntactic information in the texts and computes the similarity between them using word similarity measures based on WordNet and lexical databases. The texts are converted into a unified semantic structural model through which the semantic similarity of the texts is obtained. An approach to assessing semantic similarity is presented, and the results of applying it are evaluated using the Microsoft Research Paraphrase (MSRP) Corpus.
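The failure mode of lexical matching described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: it assumes Jaccard overlap of token sets as a stand-in for lexical matching, and `TOY_SYNONYMS` is a hypothetical two-entry table standing in for a lexical database such as WordNet. The example sentences are also invented for illustration.

```python
import re

def tokens(text):
    """Lowercase a sentence and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Token-set overlap: |A & B| / |A | B| (assumed stand-in for lexical matching)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

# Hypothetical toy table; a real system would query a lexical database instead.
TOY_SYNONYMS = {"purchased": "bought", "automobile": "car"}

def normalize(text):
    """Map each word to a canonical synonym when the toy table has one."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return " ".join(TOY_SYNONYMS.get(w, w) for w in words)

s1 = "He bought a car"
s2 = "He purchased an automobile"

lexical = jaccard(s1, s2)                        # only "he" is shared
mapped = jaccard(normalize(s1), normalize(s2))   # synonyms now count as matches

print(f"lexical overlap: {lexical:.3f}")
print(f"with synonym mapping: {mapped:.3f}")
```

The two sentences mean the same thing but share a single token, so plain overlap scores them low; mapping synonyms to a common form raises the score. Graded word similarity measures, as mentioned in the abstract, would replace the all-or-nothing table lookup used here.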
Keywords :
natural language processing; pattern matching; text analysis; MSRP corpus; Microsoft research paraphrase corpus; WordNet database; grammar patterns; lexical database; lexical matching technique; lexical representation; paraphrase identification; paraphrase recognition; semantic representation; semantic structural model; short texts; syntactic representation; syntactical information; word similarity measures; Equations; Grammar; Information technology; Market research; Natural languages; Semantics; Syntactics; Lexical database; MSRP; Paraphrase; Semantic similarity; Semantic structural model; WordNet;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Recent Trends in Information Technology (ICRTIT), 2013 International Conference on
Conference_Location :
Chennai
Type :
conf
DOI :
10.1109/ICRTIT.2013.6844249
Filename :
6844249