DocumentCode :
238749
Title :
Supervised named entity recognition in Assamese language
Author :
Talukdar, Gitimoni ; Borah, Pranjal Protim ; Baruah, Arup
Author_Institution :
Dept. of Comput. Sci. & Eng. & IT, Assam Don Bosco Univ., Guwahati, India
fYear :
2014
fDate :
27-29 Nov. 2014
Firstpage :
187
Lastpage :
191
Abstract :
Nouns play an important role in every natural language. Proper nouns, a subcategory of nouns, represent the names of persons, locations, organizations, and so on. The task of recognizing the proper nouns in a text and categorizing them into classes such as person, location, organization, and other is called Named Entity Recognition (NER). It is an essential step in many natural language processing applications and makes information extraction easier. NER in most Indian languages has been performed using rule-based, supervised, and unsupervised approaches. The target language of this work is Assamese, which is spoken by most people in the north-eastern part of India, particularly in Assam. In Assamese, NER has previously been performed using rule-based and suffix-stripping-based approaches. Supervised learning techniques are more useful than rule-based approaches and can be adapted to new domains more easily. This paper reports the first work on Assamese NER using a machine learning technique: NER is performed using a Naïve Bayes classifier. Because feature extraction plays the most important role in obtaining good performance from any machine learning technique, the aim of this work is to describe a few important features for Assamese NER and to report the performance of the system using these features.
Keywords :
Bayes methods; feature extraction; learning (artificial intelligence); natural language processing; pattern classification; Assam; Assamese NER; Assamese language; India; feature extraction; machine learning technique; naive Bayes classifier; natural language processing; supervised named entity recognition; Compounds; Computer science; Context; Educational institutions; Informatics; Organizations; Training; Corpus; Morphology; Naïve Bayes Classifier; Named Entity Recognition; Suffix stripping;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Contemporary Computing and Informatics (IC3I), 2014 International Conference on
Conference_Location :
Mysore
Type :
conf
DOI :
10.1109/IC3I.2014.7019728
Filename :
7019728