DocumentCode :
652125
Title :
Exploiting External Data for Training a Cancer Clause Classifier
Author :
Sangsoo Nam ; Sung-Hyon Myaeng
Author_Institution :
Dept. of Comput. Sci., Korea Adv. Inst. of Sci. & Technol., Daejeon, South Korea
fYear :
2013
fDate :
9-11 Sept. 2013
Firstpage :
439
Lastpage :
446
Abstract :
Automatically detecting cancer-revealing clauses in a medical report can be helpful for medical experts in various tasks such as cancer staging. While the problem of detecting such clauses can be formulated as classifying individual clauses into cancer and non-cancer categories, standard classification algorithms suffer from the fact that the training data contains many more non-cancer clauses than cancer clauses in radiology reports. In order to alleviate the data imbalance and sparseness problems related to the radiology reports, we attempt to use cancer-related external data in term weighting at the training stage. Our experiment shows that this approach indeed changes term feature statistics and improves the effectiveness of the classifier.
Keywords :
learning (artificial intelligence); medical information systems; pattern classification; radiology; text analysis; cancer clause classifier training; cancer-related external data; cancer-revealing clauses detecting; data imbalance; data sparseness problems; feature statistics; medical report; radiology reports; text classification; Biomedical imaging; Cancer; Classification algorithms; Radiology; Text categorization; Training; Training data; external data; imbalanced data; radiology report; term weighting scheme; text classification;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Healthcare Informatics (ICHI), 2013 IEEE International Conference on
Conference_Location :
Philadelphia, PA
Type :
conf
DOI :
10.1109/ICHI.2013.60
Filename :
6680507