Title :
Boosting text extraction from biomedical images using text region detection
Author :
Xu, Songhua ; Krauthammer, Michael
Author_Institution :
Oak Ridge Nat. Lab., Oak Ridge, TN, USA
Abstract :
In this paper, we show that domain-optimized text detection in biomedical images is important for boosting text extraction recall via off-the-shelf OCR engines. Methodologically, we contrast OCR performance on raw biomedical images with OCR performance when the images are first preprocessed and OCR is applied only to the detected text regions. To quantify OCR extraction results, we rely on a gold standard image text corpus with manually identified image text strings. To demonstrate the positive effect on biomedical image retrieval, we apply image text detection and extraction to a large corpus of biomedical images in the Yale Image Finder system. We show that improved text extraction results in the retrieval of a larger number of relevant images for a set of domain-relevant keyword searches.
Keywords :
feature extraction; medical image processing; optical character recognition; text analysis; Yale Image Finder system; biomedical image retrieval; domain-relevant keyword searches; image text string; off-the-shelf OCR engines; text extraction; text region detection; Biomedical imaging; Engines; Gold; Guidelines; Image retrieval; Optical character recognition software; Videos
Conference_Title :
Biomedical Sciences and Engineering Conference (BSEC), 2011
Conference_Location :
Knoxville, TN
Print_ISBN :
978-1-61284-411-4
Electronic_ISBN :
978-1-61284-410-7
DOI :
10.1109/BSEC.2011.5872319