Title :
Stemmer for resource scarce language using string similarity measure
Author :
Debbarma, Abhijit ; Purkayastha, Bs ; Bhattacharya, Pallab
Author_Institution :
Dept. of Inf. Technol., Ramkrishna Mahavidyalaya, Unakoti, India
Abstract :
This paper, a work in progress, describes the stemming of the Kokborok language using a statistical approach. Stemming of Kokborok is a new research topic. Many stemming algorithms have been proposed for various languages, but most of the work has been done for English. Interest in non-English languages has grown in recent times. However, very limited or no computational work has been reported for Kokborok, a dialect spoken in Tripura, India. Kokborok is a highly inflectional language. Linguistic knowledge and resources are a basic requirement for building a rule-based stemmer. Kokborok, a new language in this area of computational study, suffers from a lack of both. This work tries to build a Kokborok stemmer using a statistical approach based on a string similarity measure.
Keywords :
knowledge based systems; natural language processing; statistical analysis; English language; India; Kokborok language; Tripura; inflectional language; linguistic knowledge; resource scarce language; rule based stemmer; statistical approach; stemming algorithm; string similarity measure; String Similarity; Supervised learning; kokborok; nlp; stemmer;
Conference_Title :
Optimization, Reliability, and Information Technology (ICROIT), 2014 International Conference on
Conference_Location :
Faridabad
Print_ISBN :
978-1-4799-3958-9
DOI :
10.1109/ICROIT.2014.6798299