DocumentCode :
3502743
Title :
Boosting text extraction from biomedical images using text region detection
Author :
Xu, Songhua ; Krauthammer, Michael
Author_Institution :
Oak Ridge Nat. Lab., Oak Ridge, TN, USA
fYear :
2011
fDate :
15-17 March 2011
Firstpage :
1
Lastpage :
4
Abstract :
In this paper, we show that domain-optimized text detection in biomedical images is important for boosting the recall of text extraction with off-the-shelf OCR engines. Methodologically, we contrast OCR performance on raw biomedical images with OCR performance when the images are preprocessed and OCR is run only on the detected text regions. To quantify OCR extraction results, we rely on a gold-standard image text corpus with manually identified image text strings. To demonstrate the positive effect on biomedical image retrieval, we apply image text detection and extraction to a large corpus of biomedical images in the Yale Image Finder system. We show that improved text extraction results in the retrieval of a larger number of relevant images for a set of domain-relevant keyword searches.
Keywords :
feature extraction; medical image processing; optical character recognition; text analysis; Yale Image Finder system; biomedical image retrieval; domain-relevant keyword searches; image text string; off-the-shelf OCR engines; text extraction; text region detection; Biomedical imaging; Engines; Gold; Guidelines; Image retrieval; Optical character recognition software; Videos;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Biomedical Sciences and Engineering Conference (BSEC), 2011
Conference_Location :
Knoxville, TN
Print_ISBN :
978-1-61284-411-4
Electronic_ISBN :
978-1-61284-410-7
Type :
conf
DOI :
10.1109/BSEC.2011.5872319
Filename :
5872319