Title :
Frequency based named entity recognition system for under resource language
Author :
Debbarma, Abhijit ; Bhattacharya, Pallab ; Purkayastha, B.S.
Author_Institution :
Dept. of IT, Ramkrishna Mahavidyalaya Kailashahar, Unakoti, India
Abstract :
This paper studies the issues and challenges in developing a Named Entity Recognition (NER) system for a resource-scarce language of northeast India. Kokborok, a language spoken in the state of Tripura, is taken as the target language for our NER system. Kokborok is an under-resourced language for which little digital work is available. We used a frequency-based approach to test our work, which gave satisfactory results. As this is the first NER system studied for this language, we consider it our baseline NER system for future research in this area.
Keywords :
natural language processing; Kokborok; Tripura; frequency based NER system; frequency based named entity recognition system; northeast India; resource scarce language; under resource language; Dictionaries; Educational institutions; Hidden Markov models; Instruments; Support vector machines; Tagging; NER; NLP; Named entity recognition
Conference_Title :
Control, Instrumentation, Communication and Computational Technologies (ICCICCT), 2014 International Conference on
Conference_Location :
Kanyakumari
Print_ISBN :
978-1-4799-4191-9
DOI :
10.1109/ICCICCT.2014.6993076