• DocumentCode
    2456478
  • Title
    Discovering and Counting Biomedical Verbs
  • Author
    Waxmonsky, Sonjia ; Goldsmith, John ; Rzhetsky, Andrey
  • Author_Institution
    Dept. of Comput. Sci., Univ. of Chicago, Chicago, IL, USA
  • fYear
    2010
  • fDate
    12-14 Dec. 2010
  • Firstpage
    975
  • Lastpage
    978
  • Abstract
    In the biomedical domain, verbs tend to be drawn from a smaller and more regular vocabulary than in standard English and are especially informative for predicting the semantic class of the entities involved. With this motivation, we investigate the discovery and counting of verbs in the biomedical domain by applying part-of-speech tagging and unsupervised morphological analysis. Our goal is to automatically discover an almost complete set of relevant verbs and group lexical variants of the same verb into semantic classes. Additionally, we analyze differences in verb usage between biomedical and standard English, and between general biomedical texts and a specific subdomain. This verb frequency data can serve as a basis for named entity recognition in biomedical texts.
  • Keywords
    medical administrative data processing; medical computing; natural language processing; set theory; speech recognition; text analysis; unsupervised learning; biomedical domain verbs; biomedical texts; lexical variants; named entity recognition; part-of-speech tagging; semantic classes; standard English verb; unsupervised morphological analysis; verb frequency data; Biomedical measurements; Frequency measurement; Protein engineering; Proteins; Semantics; Tagging; Vocabulary
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Applications (ICMLA), 2010 Ninth International Conference on
  • Conference_Location
    Washington, DC
  • Print_ISBN
    978-1-4244-9211-4
  • Type
    conf
  • DOI
    10.1109/ICMLA.2010.155
  • Filename
    5708979