Title :
Label the many with a few: Semi-automatic medical image modality discovery in a large image collection
Author :
Vajda, Szilard ; Daekeun You ; Antani, Sameer K. ; Thoma, George R.
Author_Institution :
Nat. Libr. of Med., Nat. Inst. of Health, Bethesda, MD, USA
Abstract :
In this paper, we present a fast and effective method for labeling images in a large image collection. Image modality detection has been of research interest for querying multimodal medical documents. To accurately predict the different image modalities using complex visual and textual features, we need advanced classification schemes with supervised learning mechanisms and accurate training labels. Our proposed method, on the other hand, uses a multiview approach and requires minimal expert knowledge to semi-automatically label the images. The images are first projected into different feature spaces and then clustered in an unsupervised manner. Only the cluster representative images are labeled by an expert. Other images from the cluster “inherit” the labels from these cluster representatives. The final label assigned to each image is based on a voting mechanism, where each vote is derived from the clustering in a different feature space. Through experiments we show that using only 0.3% of the labels was sufficient to annotate 300,000 medical images with 49.95% accuracy. Although automatic labeling is not as precise as manual labeling, it saves approximately 700 hours of manual expert labeling, and may be sufficient for next-stage classifier training. We find that for this collection, accuracy improvements are feasible with better disparate feature selection or different filtering mechanisms.
Keywords :
document image processing; feature extraction; feature selection; image classification; image retrieval; learning (artificial intelligence); medical image processing; pattern clustering; accuracy improvement; classification schemes; cluster representative image labelling; complex textual features; complex visual features; feature selection; feature space clustering; feature spaces; filtering mechanisms; image clustering; image modality detection; image projection; large-image collection; multimodal medical document query; multiview-approach; next-stage classifier training; semiautomatic image labeling; semiautomatic medical image modality discovery; supervised learning mechanisms; training labels; unsupervised mechanism; voting mechanism; Biomedical imaging; Computed tomography; Feature extraction; Labeling; Manuals; Visualization; X-ray imaging;
Conference_Title :
Computational Intelligence in Healthcare and e-health (CICARE), 2014 IEEE Symposium on
Conference_Location :
Orlando, FL
DOI :
10.1109/CICARE.2014.7007850