DocumentCode :
162605
Title :
An Automated System for Tamil Named Entity Recognition Using Hybrid Approach
Author :
Srinivasagan, K.G. ; Suganthi, S. ; Jeyashenbagavalli, N.
Author_Institution :
Nat. Eng. Coll., Kovilpatti, India
fYear :
2014
fDate :
6-7 March 2014
Firstpage :
435
Lastpage :
439
Abstract :
Named Entity Recognition (NER) is the process of identifying and classifying named entities such as persons, organizations, locations, dates, times and monetary amounts in text documents. NER is a subtask of Information Extraction, the process of extracting relevant data from documents, which is one of the research areas in natural language processing. In this project we implement a named entity recognizer using a hybrid approach that applies a rule-based method and a Hidden Markov Model (HMM) in succession and identifies only person, location and organization names. The input to the proposed NER system is any Tamil-language text document from any domain, drawn from corpora of limited size. The system tags each word using a POS tagger and then imposes certain rules based on lexical features and gazetteers. The HMM, which uses the E-M algorithm, takes the output data from training as input to the recognition system. The main purpose of the system is to identify unknown entities and to resolve the problem of the same named entity appearing in different positions in the same document. The system measures recall and precision and calculates the F-measure score from them. The goal of this project is to improve the performance of the NER system by achieving a high F-measure score.
Keywords :
expectation-maximisation algorithm; hidden Markov models; natural language processing; text analysis; E-M algorithm; F-measure score; HMM model; NER system performance improvement; POS tagger; automated Tamil named entity recognition system; gazetteers; hidden Markov model; hybrid approach; information extraction; input data; lexical features; location names; named entity identification; natural language processing; organization names; output data; person names; precision parameter; recall parameter; relevant data extraction; rule-based approach; same name entity problem; text documents; unknown entities; word tagging; Artificial neural networks; Data mining; Hidden Markov models; Natural language processing; Organizations; Tagging; HMM; Morphological analyzer; NER; NLP; POS tagging;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Intelligent Computing Applications (ICICA), 2014 International Conference on
Conference_Location :
Coimbatore
Type :
conf
DOI :
10.1109/ICICA.2014.95
Filename :
6965087