• DocumentCode
    3749252
  • Title

    An innovative lemmatization technique for Bangla nouns by using longest suffix stripping methodology in decreasing order

  • Author

    Alok Ranjan Pal;Niladri Sekhar Dash;Diganta Saha

  • Author_Institution
    Dept. of Computer Sc. and Engg., College of Engg. & Mgmt, Kolaghat, India
  • fYear
    2015
  • Firstpage
    675
  • Lastpage
    678
  • Abstract
    This work attempts to extract the root form of inflected Bangla nouns using a longest-suffix-stripping method applied in decreasing order. The test data is drawn from a Bangla text corpus developed under the TDIL Project of the Govt. of India. The exhaustive suffix list is obtained from research carried out at the Linguistic Research Unit of the Indian Statistical Institute, Kolkata, while the list of non-inflected Bangla nouns is obtained from a wordlist generated by Pashchimbanga Bangla Akademi, Kolkata, and available online. The algorithm is applied to 1273 randomly selected noun instances and achieves an accuracy of around 94%.
  • Keywords
    "Pragmatics","Morphology","Conferences","Algorithm design and analysis","Complexity theory","Computational modeling","Computers"
  • Publisher
    IEEE
  • Conference_Titel
    Computing and Network Communications (CoCoNet), 2015 International Conference on
  • Type

    conf

  • DOI
    10.1109/CoCoNet.2015.7411262
  • Filename
    7411262
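
The suffix-stripping idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that "decreasing order" means trying suffixes from longest to shortest, and that a stripped form is accepted only if it appears in the list of non-inflected nouns. The function name `lemmatize_noun` and the tiny `SUFFIXES` and `ROOTS` sets are hypothetical stand-ins for the paper's exhaustive suffix list and noun wordlist.

```python
# Toy stand-ins (assumptions) for the suffix list and the non-inflected noun list.
SUFFIXES = {"দের", "রা", "কে", "র"}
ROOTS = {"ছেলে", "বই"}


def lemmatize_noun(word, suffixes, roots):
    """Return the root of `word` by stripping the longest valid suffix.

    Suffixes are tried in decreasing order of length; a candidate root is
    accepted only if it is present in `roots`. If nothing matches, the
    word is returned unchanged.
    """
    if word in roots:
        return word  # already a non-inflected form
    for suffix in sorted(suffixes, key=len, reverse=True):
        if suffix and word.endswith(suffix):
            candidate = word[: -len(suffix)]
            if candidate in roots:
                return candidate
    return word


if __name__ == "__main__":
    for w in ["ছেলেদের", "ছেলেরা", "বইকে", "বই"]:
        print(w, "->", lemmatize_noun(w, SUFFIXES, ROOTS))
```

Checking each candidate against the root list is what keeps a short suffix such as "র" from producing a non-word when a longer suffix such as "দের" is the correct one to strip.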