• DocumentCode
    3300422
  • Title
    Resources for Nepali Word Sense Disambiguation
  • Author
    Shrestha, Niraj ; Hall, Patrick A V ; Bista, Sanat K.
  • Author_Institution
    Inf. & Language Process. Res. Lab., Kathmandu Univ., Kathmandu
  • fYear
    2008
  • fDate
    19-22 Oct. 2008
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    Word sense disambiguation (WSD) is the process of identifying the intended meaning of a word that has multiple meanings. It is regarded as one of the most challenging problems in natural language processing (NLP). Nepali also has words with multiple meanings, so the WSD problem arises in Nepali as well. In this paper, we investigate the impact of NLP resources such as a morphology analyzer (MA) and a machine-readable dictionary (MRD) on ambiguity resolution. Our results show that WSD accuracy improves when such resources are available. The Lesk algorithm was used with a sample Nepali WordNet containing a few sets of Nepali nouns, so the system can disambiguate only these nouns. The system was tested on a small data set with a limited number of nouns. Accuracy ranged from 50% to 70%, depending on the sample data provided. When the same data was tested with manual morphological analysis, accuracy was considerably higher (80%).
  • Keywords
    dictionaries; natural language processing; word processing; Lesk algorithm; Nepali word sense disambiguation; machine readable dictionary; morphology analyzer; Availability; Computer science; Information retrieval; Morphology; Natural languages; Software systems; Speech processing; System testing; Language; Nepali WordNet; WSD
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Language Processing and Knowledge Engineering, 2008. NLP-KE '08. International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4244-4515-8
  • Electronic_ISBN
    978-1-4244-2780-2
  • Type
    conf
  • DOI
    10.1109/NLPKE.2008.4906758
  • Filename
    4906758
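
The abstract names the Lesk algorithm but the record does not describe the authors' implementation. As an illustration only, the following is a minimal sketch of the simplified Lesk idea: choose the sense whose dictionary gloss shares the most words with the context. The function names, the stopword list, and the English toy sense inventory are all hypothetical stand-ins and are not taken from the paper or from the Nepali WordNet.

```python
# Simplified Lesk sketch. Everything here (names, glosses, stopwords) is an
# illustrative assumption, not the system described in the abstract.

STOPWORDS = {"a", "an", "the", "of", "in", "on", "to", "and", "or",
             "that", "is", "for", "by", "at"}


def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop stopwords."""
    words = (w.strip(".,;:!?").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}


def simplified_lesk(word, context, inventory):
    """Return the sense label whose gloss overlaps most with the context.

    Ties go to the sense listed first in the inventory.
    """
    context_words = tokenize(context) - {word.lower()}
    best_sense, best_overlap = None, -1
    for sense, gloss in inventory[word].items():
        overlap = len(context_words & tokenize(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


# Hypothetical toy inventory standing in for a WordNet-style resource.
TOY_INVENTORY = {
    "bank": {
        "bank.finance": "an institution that accepts deposits of money and makes loans",
        "bank.river": "sloping land beside a river or other body of water",
    }
}

if __name__ == "__main__":
    print(simplified_lesk(
        "bank", "She sat on the bank of the river and watched the water",
        TOY_INVENTORY))
    print(simplified_lesk(
        "bank", "He went to the bank to deposit money", TOY_INVENTORY))
```

Because the sketch matches surface forms exactly, "deposit" in the context does not match "deposits" in the gloss. That kind of miss is one plausible reason morphological analysis could raise accuracy, in line with the trend the abstract reports, though the record does not state the mechanism.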