DocumentCode
3574526
Title
Named entity recognition for Tamil biomedical documents
Author
Betina Antony, J. ; Mahalakshmi, G.S.
Author_Institution
Dept. of Comput. Sci. & Eng., Anna Univ., Chennai, India
fYear
2014
Firstpage
1571
Lastpage
1577
Abstract
Valuable information about Tamil traditional medicines is available in various forms, such as books, magazines and websites. These sources are, however, very large and unstructured. Our system focuses on constructing an NER identification module that uses an SVM classifier to identify named entities and classify them into their corresponding categories. The two main categories considered are names of disorders and names of ingredients used. The system uses features such as unigrams/bigrams, case markers, substring clues and tf-idf scores to assign the entities to their classes. The named entities are then stored in the NE Dictionary according to their categories.
Keywords
document handling; natural language processing; pattern classification; support vector machines; NE Dictionary; NER identification module; SVM classifier; Tamil biomedical documents; Tamil traditional medicines; named entity identification; named entity recognition; Computers; Dictionaries; Feature extraction; Hidden Markov models; Natural language processing; Support vector machines; Biomedical NER; SVM classification; Siddha documents; Tamil Biomedical Documents
fLanguage
English
Publisher
IEEE
Conference_Titel
Circuit, Power and Computing Technologies (ICCPCT), 2014 International Conference on
Print_ISBN
978-1-4799-2395-3
Type
conf
DOI
10.1109/ICCPCT.2014.7055016
Filename
7055016