• DocumentCode
    2747021
  • Title

    Gene Name Automatic Recognition in Biomedical Literature

  • Author

    Yang, Zhihao ; Lin, Hongfei ; Zhao, Jing

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Dalian Univ. of Technol.
  • Volume
    2
  • fYear
    2006
  • fDate
    0-0 0
  • Firstpage
    9391
  • Lastpage
    9395
  • Abstract
    Identifying gene names in biomedical texts is regarded as a crucial step in text mining. Our approach combines a dictionary-based approach with a machine-learning-based approach. Using a gene name dictionary, an edit-distance approximate string searching algorithm was used to improve the recall of gene recognition, which is greatly lowered by the lack of standard gene-naming conventions. Naive Bayes and SVM classifiers were then adopted to filter out false recognitions, thereby improving the precision of gene recognition. The experiments show that the classifiers greatly improve precision with a slight loss of recall, resulting in a much better F-score (from 53.7% to 67.6%).
  • Keywords
    biology computing; classification; data mining; dictionaries; genetics; learning (artificial intelligence); string matching; text analysis; SVM classifier; biomedical literature; biomedical texts; edit distance; gene name automatic recognition; gene name dictionary; machine learning; naive Bayes classifier; string searching; text mining; Biomedical engineering; Computer science; Dictionaries; Electronic mail; Epidermis; Machine learning; Support vector machine classification; Support vector machines; Text mining; Text recognition; Edit Distance; Naive Bayes Classifier; SVM Classifier; Text Mining;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Control and Automation, 2006. WCICA 2006. The Sixth World Congress on
  • Conference_Location
    Dalian
  • Print_ISBN
    1-4244-0332-4
  • Type

    conf

  • DOI
    10.1109/WCICA.2006.1713819
  • Filename
    1713819
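
The dictionary lookup step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the per-token matching, the lowercasing, and the `max_dist` threshold are all assumptions made for illustration, and the naive Bayes/SVM filtering stage is omitted. It shows how an edit-distance tolerance lets spelling variants such as "BRCA-1" match a dictionary entry "BRCA1", which raises recall at the cost of extra false positives, and how an F-score is computed from precision and recall.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def find_gene_candidates(tokens, dictionary, max_dist=1):
    """Hypothetical helper: return (token, entry, distance) for every token
    within max_dist edits of a dictionary entry (case-insensitive)."""
    hits = []
    for tok in tokens:
        for entry in dictionary:
            d = edit_distance(tok.lower(), entry.lower())
            if d <= max_dist:
                hits.append((tok, entry, d))
    return hits


def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    tokens = ["BRCA-1", "protein", "p53"]   # toy input, not data from the paper
    dictionary = ["BRCA1", "TP53"]          # toy dictionary
    print(find_gene_candidates(tokens, dictionary))
```

A brute-force scan over every dictionary entry is quadratic in practice; a real system would likely index the dictionary, and candidates found this way would still need a classifier to filter out false recognitions, as the abstract describes.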