Title :
Anaphora Resolution in Hindi Documents
Author :
Agarwal, Sachin ; Srivastava, Manaj ; Agarwal, Pallavi ; Sanyal, Ratna
Author_Institution :
Indian Inst. of Inf. Technol., Allahabad
fDate :
Aug. 30 2007-Sept. 1 2007
Abstract :
This paper presents anaphora resolution as a technique for the semantic analysis of text documents written in the Hindi language. The focus is on texts that mainly employ simple sentences, such as children's stories and short essays. The technique works by locating sentences in the text that are semantically related through anaphors, analyzing their semantics, and using that analysis to resolve the referents of the respective anaphors. The approach is based on matching constraints on the grammatical attributes of different words. The algorithm for anaphora resolution has been tested extensively. The accuracy of anaphora resolution is nearly 96% for simple sentences; for compound and complex sentences, it is of the order of 80%. The causes of the errors are analyzed and possible techniques for improvement are discussed.
Keywords :
grammars; knowledge representation; natural languages; pattern matching; text analysis; Hindi language; anaphora resolution; semantic text document analysis; Algorithm design and analysis; Data mining; Genetics; Information retrieval; Information technology; Natural languages; Performance analysis; Speech; Tellurium; Testing
Conference_Titel :
Natural Language Processing and Knowledge Engineering, 2007. NLP-KE 2007. International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-1611-0
Electronic_ISBN :
978-1-4244-1611-0
DOI :
10.1109/NLPKE.2007.4368070