• DocumentCode
    3474635
  • Title
    Biomedical Term Disambiguation: An Application to Gene-Protein Name Disambiguation
  • Author
    Al-Mubaid, Hisham ; Chen, Ping
  • Author_Institution
    Houston Univ.
  • fYear
    2006
  • fDate
    10-12 April 2006
  • Firstpage
    606
  • Lastpage
    612
  • Abstract
    The huge volumes of biomedical texts available online drive the increasing need for automated techniques to analyze and extract knowledge from these repositories of information. Resolving the ambiguity of biological terms in these texts is an important step in developing efficient knowledge discovery techniques. In this paper, we present a new method for biomedical term disambiguation in biomedical texts. The method is based on machine learning and can be viewed as a word classification task. We evaluated the method on gene-protein name disambiguation using Medline abstracts from the years 1999-2003 containing about 3000 to 6000 gene and protein names. The technique is effective in disambiguating gene and protein names, achieving impressive accuracy, precision, and recall, with accuracy approaching 90%, and outperforming recently published results on this problem. Our technique is also applicable to the general problem of named entity disambiguation.
  • Keywords
    classification; data mining; learning (artificial intelligence); medical information systems; text analysis; Medline abstract; biomedical term disambiguation; biomedical text mining; gene-protein name disambiguation; knowledge discovery; machine learning; word classification; Abstracts; Bioinformatics; Data mining; Drives; Information analysis; Lakes; Machine learning; Natural language processing; Proteins; Text mining; Term disambiguation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Technology: New Generations, 2006. ITNG 2006. Third International Conference on
  • Conference_Location
    Las Vegas, NV
  • Print_ISBN
    0-7695-2497-4
  • Type
    conf
  • DOI
    10.1109/ITNG.2006.39
  • Filename
    1611671