Regional Information Center for Science and Technology - Notice of Violation of IEEE Publication Principles
Sentence Similarity Computation Based on WordNet and Corpus Statistics

Abstract:

Notice of Violation of IEEE Publication Principles

"Sentence Similarity Computation Based on WordNet and Corpus Statistics,"
by P. Selvi, and N.P. Gopalan,
in the Proceedings of the International Conference on Computational Intelligence and Multimedia Applications, 2007, vol.1, pp.9-14, 13 Dec. 2007

After careful and considered review of the content and authorship of this paper by a duly constituted expert committee, this paper has been found to be in violation of IEEE's Publication Principles.

This paper contains significant portions of original text from the papers cited below. The original text was copied without attribution (including appropriate references to the original author(s) and/or paper titles) and without permission. Further, the expert committee found only one of the two authors, Mr. P. Selvi, in violation of the IEEE Publication Principles.

"A Method for Measuring Sentence Similarity and its Application to Conversational Agents,"
by J. Li, Z. Bandar, D. McLean, and J. O'Shea,
in the Proceedings of the 17th International Florida Artificial Intelligence Research Society Conference (FLAIRS 2004), pages 820-825, Miami Beach, FL. AAAI Press.

"A Combination-based Semantic Similarity Measure using Multiple Information Sources,"
by H.A. Nguyen and H. Al-Mubaid,
in the Proceedings of the 2006 IEEE International Conference on Information Reuse and Integration, pp.617-621, 16 Sept. 2006

"Sentence Similarity Based on Semantic Nets and Corpus Statistics,"
by Y. Li, D. McLean, Z.A. Bandar, J.D. O'Shea, and K. Crockett,
in the IEEE Transactions on Knowledge and Data Engineering, vol.18, no.8, pp.1138-1150, Aug. 2006

"Semantic Similarity of Short Texts,"
by A. Islam and D. Inkpen,
in Proceedings of the International Conference

Measures of text similarity have been used for a long time in applications in natural language processing and related areas such as text mining, Web page retrieval, and dialogue systems. Existing methods for computing sentence similarity have been adopted from approaches used for long text documents. These methods process sentences in a very high-dimensional space and are consequently inefficient, require human input, and are not adaptable to some application domains. This paper presents a method for measuring the semantic similarity of texts, using corpus-based and knowledge-based measures of similarity. The semantic similarity of two sentences is calculated using information from a structured lexical database and from corpus statistics. The use of a lexical database enables our method to model human common sense knowledge and the incorporation of corpus statistics allows our method to be adaptable to different domains. The proposed method can be used in a variety of applications that involve text knowledge representation and discovery. Experiments on two sets of selected sentence pairs demonstrate that the proposed method provides a similarity measure that shows a significant correlation to human intuition.