Title :
A Comparative Study of Named Entity Recognition for Hindi Using Sequential Learning Algorithms
Author :
Krishnarao, Awaghad Ashish ; Gahlot, Himanshu ; Srinet, Amit ; Kushwaha, D.S.
Author_Institution :
Motilal Nehru Nat. Inst. of Technol., Allahabad
Abstract :
In this paper we present a comparative study of two sequential learning algorithms, conditional random fields (CRF) and support vector machines (SVM), applied to the task of named entity recognition in Hindi. Because the features used are language independent, the same procedure can be applied to tag named entities in other Indian languages such as Telugu, Bengali, and Marathi. We used CRF++ to implement the CRF algorithm and YamCha to implement the SVM algorithm. The results show that CRF outperforms SVM. They are only slightly lower than the highest results achieved for this task, a gap due to the absence of any pre-processing or post-processing steps. The system uses the contextual information of words along with various language-independent features to label the named entities (NEs). We first present the two systems (CRF and SVM) and then compare their results on the same data.
Keywords :
learning (artificial intelligence); natural language processing; random processes; support vector machines; CRF++; Hindi entity recognition; SVM; conditional random fields; contextual information; named entities; sequential learning algorithms; static vector machine; Costs; Entropy; Labeling; Machine learning; Natural languages; Predictive models; Shape; Support vector machines; Testing;
Conference_Title :
Advance Computing Conference, 2009. IACC 2009. IEEE International
Conference_Location :
Patiala
Print_ISBN :
978-1-4244-2927-1
Electronic_ISBN :
978-1-4244-2928-8
DOI :
10.1109/IADCC.2009.4809179