Regional Information Center for Science and Technology - Text-line extraction and character recognition of Japanese newspaper headlines with graphical designs

DocumentCode :

2396351

Title :

Text-line extraction and character recognition of Japanese newspaper headlines with graphical designs

Author :

Sawaki, Minako ; Hagita, Norihiro

Author_Institution :

NTT Basic Res. Labs., Kanagawa, Japan

Volume :

fYear :

1996

fDate :

25-29 Aug 1996

Firstpage :

Abstract :

Conventional OCR fails to recognize most characters in Japanese newspaper headlines with graphical designs because the designs are difficult to remove. This paper proposes a method that recognizes such characters without removing the designs. First, text-line regions are extracted from a local distribution of the combination of black and white runs observed in a rectangular window while the window is shifted pixel-by-pixel in the direction of the text-line. Characters in the extracted text-line region are then recognized by displacement matching. Adaptive thresholding against the degree of degradation suppresses spurious candidates yielded by displacement matching, even in the presence of graphical designs. Experimental results for fifty Japanese newspaper headlines show that the method achieves a recognition rate of 97.7%, much higher than that of a conventional method (17.0%).
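
The first step of the abstract (a local distribution of black and white run combinations collected in a window that shifts pixel-by-pixel along the text-line) can be illustrated with a minimal sketch. The abstract does not give the paper's actual feature definition, so everything here is an assumption made for illustration: the image is a list of binary rows with 1 = black, the text-line is horizontal, a "combination" is taken to be a black run paired with the white run that immediately follows it, and the function names are hypothetical.

```python
from collections import Counter
from itertools import groupby


def run_lengths(row):
    """Return [(pixel_value, run_length), ...] for one binary row (1 = black)."""
    return [(value, len(list(group))) for value, group in groupby(row)]


def black_white_pairs(row):
    """Pair each black run with the white run that immediately follows it.

    Assumption: this pairing stands in for the "combination of black and
    white runs" mentioned in the abstract; the paper may define it differently.
    """
    runs = run_lengths(row)
    return [
        (black_len, white_len)
        for (black_val, black_len), (white_val, white_len) in zip(runs, runs[1:])
        if black_val == 1 and white_val == 0
    ]


def sliding_run_distributions(image, window_width):
    """Shift a window one pixel at a time along a horizontal text-line.

    For each window position, count the (black run, white run) length pairs
    found in every row segment inside the window. Returns one Counter per
    window position.
    """
    width = len(image[0])
    distributions = []
    for x in range(width - window_width + 1):
        counts = Counter()
        for row in image:
            counts.update(black_white_pairs(row[x:x + window_width]))
        distributions.append(counts)
    return distributions
```

A classifier that labels each window position as text or non-text from these counts would be the natural next stage; it is not sketched here because the abstract gives no detail on how the distribution is used.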

Keywords :

document image processing; image segmentation; optical character recognition; Japanese newspaper headlines; adaptive thresholding; black and white runs; character recognition; degree of degradation; displacement matching; graphical designs; recognition rate; text-line extraction; Character recognition; Degradation; Design methodology; Image databases; Laboratories; Optical character recognition software; Optical devices; Pixel; Robustness; Software libraries;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Proceedings of the 13th International Conference on Pattern Recognition, 1996

Conference_Location :

Vienna

ISSN :

1051-4651

Print_ISBN :

0-8186-7282-X

Type :

conf

DOI :

10.1109/ICPR.1996.546797

Filename :

546797

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2396351