Title :
Learning Recognition of Ambiguous Proper Names in Hindi
Author :
Sinha, Rai Mahesh K
Author_Institution :
MTU, Noida, India
Abstract :
An ambiguous proper name is a name that is also a valid dictionary word with a meaning of its own when used in text. For example, in English the word 'bush' in 'Mr. Bush' is a proper name, whereas in 'a dense bush' it is a lexical entity. Almost all proper names in Hindi have a meaning and find an entry in the dictionary. Recognition of named entities finds wide application in MT, IR and several other NLP tasks. While there have been a number of investigations on Hindi NER in general, no work has been reported exclusively on ambiguous proper nouns, which are more difficult to deal with. This paper presents a methodology for recognizing ambiguous proper names in Hindi using a hybrid of a rule base and statistical CRF-based machine learning with morphological and context features. The methodology yields an F-score of 71.6%.
Keywords :
learning (artificial intelligence); natural language processing; statistical analysis; Hindi; ambiguous proper names; context features; dictionary word; learning recognition; lexical entity; machine learning; morphological features; proper nouns; rule base CRF; statistical CRF; Conferences; Context; Dictionaries; Information processing; Machine learning; Semantics; Training; Hindi ambiguous proper names; NLP; named entity recognition; semi-supervised hybrid learning; sense disambiguation; sparse corpus
Conference_Title :
Machine Learning and Applications and Workshops (ICMLA), 2011 10th International Conference on
Conference_Location :
Honolulu, HI
Print_ISBN :
978-1-4577-2134-2
DOI :
10.1109/ICMLA.2011.87