• DocumentCode
    3574526
  • Title

    Named entity recognition for Tamil biomedical documents

  • Author

    Betina Antony, J. ; Mahalakshmi, G.S.

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Anna Univ., Chennai, India
  • fYear
    2014
  • Firstpage
    1571
  • Lastpage
    1577
  • Abstract
    Valuable information about Tamil traditional medicines is available in various forms such as books, magazines and websites. These instructions are, however, very large and unstructured. Our system focuses on constructing an NER identification module using an SVM classifier to identify named entities and to classify them into their corresponding categories. The two main categories considered are names of disorders and names of ingredients used. The system uses features such as unigrams/bigrams, case markers, substring clues and tf-idf score to classify the entities into their classes. These named entities are stored in the NE Dictionary based on their categories.
  • Keywords
    document handling; natural language processing; pattern classification; support vector machines; NE Dictionary; NER identification module; SVM classifier; Tamil biomedical documents; Tamil traditional medicines; named entity identification; named entity recognition; Computers; Dictionaries; Feature extraction; Hidden Markov models; Natural language processing; Support vector machines; Biomedical NER; SVM classification; Siddha documents; Tamil Biomedical Documents;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Circuit, Power and Computing Technologies (ICCPCT), 2014 International Conference on
  • Print_ISBN
    978-1-4799-2395-3
  • Type

    conf

  • DOI
    10.1109/ICCPCT.2014.7055016
  • Filename
    7055016