• DocumentCode
    652125
  • Title
    Exploiting External Data for Training a Cancer Clause Classifier
  • Author
    Sangsoo Nam; Sung-Hyon Myaeng
  • Author_Institution
    Dept. of Comput. Sci., Korea Adv. Inst. of Sci. & Technol., Daejeon, South Korea
  • fYear
    2013
  • fDate
    9-11 Sept. 2013
  • Firstpage
    439
  • Lastpage
    446
  • Abstract
    Automatically detecting cancer-revealing clauses in a medical report can help medical experts in various tasks such as cancer staging. While the problem of detecting such clauses can be formulated as classifying individual clauses into cancer and non-cancer categories, standard classification algorithms suffer from the fact that the training data drawn from radiology reports contains far more non-cancer clauses than cancer clauses. To alleviate the data imbalance and sparseness problems of radiology reports, we use cancer-related external data for term weighting at the training stage. Our experiment shows that this approach indeed changes term feature statistics and improves the effectiveness of the classifier.
  • Keywords
    learning (artificial intelligence); medical information systems; pattern classification; radiology; text analysis; cancer clause classifier training; cancer-related external data; cancer-revealing clauses detecting; data imbalance; data sparseness problems; feature statistics; medical report; radiology reports; text classification; Biomedical imaging; Cancer; Classification algorithms; Radiology; Text categorization; Training; Training data; external data; imbalanced data; radiology report; term weighting scheme
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Healthcare Informatics (ICHI), 2013 IEEE International Conference on
  • Conference_Location
    Philadelphia, PA
  • Type
    conf
  • DOI
    10.1109/ICHI.2013.60
  • Filename
    6680507