DocumentCode
169801
Title
Statistical Analysis of ML-Based Paraphrase Detectors with Lexical Similarity Metrics
Author
El-Alfy, El-Sayed M.
Author_Institution
Coll. of Comput. Sci. & Eng., King Fahd Univ. of Pet. & Miner., Dhahran, Saudi Arabia
fYear
2014
fDate
6-9 May 2014
Firstpage
1
Lastpage
5
Abstract
Paraphrase detection has several important applications in natural language processing, including language translation, text summarization, question answering, plagiarism detection, and online information retrieval. A number of metrics have been proposed in the literature to quantify the textual similarity between two sentences. However, the accuracy of using any single similarity metric on its own to detect paraphrases is very low. Although some machine learning (ML) techniques have been applied to paraphrase detection, no known study intensively benchmarks their performance on this problem under similar conditions. In this paper, we evaluate the utility of integrating five lexical similarity metrics with three standard machine learning paradigms to detect paraphrases. We apply statistical tests to compare the adopted ML-based paraphrase detectors on different datasets and to assess whether the differences in their performance are significant.
Keywords
learning (artificial intelligence); natural language processing; statistical analysis; ML-based paraphrase detectors; lexical similarity metrics; machine learning paradigms; paraphrase detection; Educational institutions; Kernel; Measurement; Niobium; Support vector machines; Training; Vectors
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Science and Applications (ICISA), 2014 International Conference on
Conference_Location
Seoul
Print_ISBN
978-1-4799-4443-9
Type
conf
DOI
10.1109/ICISA.2014.6847467
Filename
6847467