• Title of article

    A machine-learning approach to negation and speculation detection in clinical texts

  • Author/Authors

    Noa P. Cruz Díaz, Manuel J. Maña López, Jacinto Mata Vázquez, Victoria Pachón Álvarez

  • Issue Information
    Monthly, with serial issue number, year 2012
  • Pages
    13
  • From page
    1398
  • To page
    1410
  • Abstract
    Detecting negative and speculative information is essential in most biomedical text-mining tasks where these language forms are used to express impressions, hypotheses, or explanations of experimental results. Our research is focused on developing a system based on machine-learning techniques that identifies negation and speculation signals and their scope in clinical texts. The proposed system works in two consecutive phases: first, a classifier decides whether each token in a sentence is a negation/speculation signal or not. Then another classifier determines, at sentence level, the tokens which are affected by the signals previously identified. The system was trained and evaluated on the clinical texts of the BioScope corpus, a freely available resource consisting of medical and biological texts: full-length articles, scientific abstracts, and clinical reports. The results obtained by our system were compared with those of two different systems, one based on regular expressions and the other based on machine learning. Our system outperformed both of them. In the signal detection task, the F-score was 97.3% in negation and 94.9% in speculation. In the scope-finding task, a token was correctly classified if it had been properly identified as being inside or outside the scope of all the negation signals present in the sentence. Our proposal showed an F-score of 93.2% in negation and 80.9% in speculation. Additionally, the percentage of correct scopes (those with all their tokens correctly classified) was evaluated, yielding F-scores of 90.9% in negation and 71.9% in speculation.
  • Keywords
    Machine learning , biomedical information , Natural language processing
  • Journal title
    Journal of the American Society for Information Science and Technology
  • Serial Year
    2012
  • Record number

    994685
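
The abstract reports two kinds of scope measurement: a token-level F-score (each token judged as inside or outside a scope) and a stricter count of scopes whose tokens are all classified correctly. The following is a minimal sketch of how such measures could be computed, not the authors' evaluation code. The function names `token_f_score` and `exact_scope_rate`, the 0/1 label encoding, and the one-scope-per-sentence simplification are all assumptions made for illustration; the record does not describe the exact protocol.

```python
from typing import List, Tuple

Labels = List[List[int]]  # one list per sentence; 1 = token inside a scope, 0 = outside (assumed encoding)


def token_f_score(gold: Labels, pred: Labels) -> Tuple[float, float, float]:
    """Return (precision, recall, F-score) over tokens labelled 1 (inside a scope)."""
    tp = fp = fn = 0
    for gold_sent, pred_sent in zip(gold, pred):
        for g, p in zip(gold_sent, pred_sent):
            if p == 1 and g == 1:
                tp += 1          # correctly placed inside the scope
            elif p == 1 and g == 0:
                fp += 1          # predicted inside, actually outside
            elif p == 0 and g == 1:
                fn += 1          # predicted outside, actually inside
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_score


def exact_scope_rate(gold: Labels, pred: Labels) -> float:
    """Return the fraction of sentences whose every token label matches the gold labels."""
    if not gold:
        return 0.0
    matches = sum(1 for g, p in zip(gold, pred) if g == p)
    return matches / len(gold)


# Hypothetical toy data: two sentences of four tokens each.
gold_labels = [[0, 1, 1, 0], [0, 0, 1, 1]]
pred_labels = [[0, 1, 1, 0], [0, 1, 1, 0]]

precision, recall, f_score = token_f_score(gold_labels, pred_labels)
scope_rate = exact_scope_rate(gold_labels, pred_labels)
print(precision, recall, f_score, scope_rate)
```

In the toy data the second sentence has most tokens right but still counts as a wrong scope, which illustrates why the exact-scope figures in the abstract are lower than the token-level ones.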