Title :
Recognition of Chinese business cards
Author :
Chiou, Yaw-Huei ; Lee, Hsi-Jian
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Nat. Chiao Tung Univ., Hsinchu, Taiwan
Abstract :
Business cards carry many kinds of information, such as names, addresses and telephone numbers. To use this information effectively, it must be extracted from the cards automatically so that a database can be built. The goal of this paper is to extract and recognize characters from color business cards. To separate the foreground from the background in a card, we assign all pixels to eight color types. Then we calculate a dynamic threshold using the color information to extract the foreground. Next, we extract the characters in four steps: (i) connected component extraction, (ii) local thresholding, (iii) mark, line and noise deletion, and (iv) character grouping. Finally, we recognize the characters with a statistical Chinese and English character recognition system. We test the system on 30 business cards that contain Chinese characters, English characters, numerals and punctuation marks. The extraction rate and accuracy of our system are 96.97% and 95.43%, respectively. The recognition rate is 88.78% for Chinese characters and 97.58% for English characters, numerals and punctuation marks.
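The first stages named in the abstract (assigning pixels to eight color types, separating foreground from background, and extracting connected components) can be illustrated with a minimal sketch. The abstract does not say how the eight color types are defined or how the dynamic threshold is computed, so everything below is an assumption made for illustration: each RGB channel is split at a fixed cut of 128 (giving 2^3 = 8 types), the most frequent type is treated as background in place of the paper's dynamic threshold, and the names `color_type`, `foreground_mask` and `connected_components` are hypothetical.

```python
from collections import deque


def color_type(pixel, cut=128):
    """Map an (r, g, b) pixel to one of eight types, 0..7.

    Assumed scheme: one bit per channel, set when the channel is >= cut.
    """
    r, g, b = pixel
    return (int(r >= cut) << 2) | (int(g >= cut) << 1) | int(b >= cut)


def foreground_mask(image, cut=128):
    """Mark every pixel whose color type differs from the most frequent type.

    Assumption: the dominant color type is the card background. This is a
    simple stand-in for the dynamic threshold mentioned in the abstract.
    """
    types = [[color_type(p, cut) for p in row] for row in image]
    counts = [0] * 8
    for row in types:
        for t in row:
            counts[t] += 1
    background = counts.index(max(counts))
    return [[t != background for t in row] for row in types]


def connected_components(mask):
    """Return bounding boxes (top, left, bottom, right) of 4-connected regions."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < height and 0 <= nx < width:
                        if mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes
```

Calling `connected_components(foreground_mask(image))` on an image given as a list of rows of `(r, g, b)` tuples yields one bounding box per candidate component. The later steps listed in the abstract (local thresholding, mark, line and noise deletion, and character grouping) would operate on these boxes and are not sketched here.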
Keywords :
business forms; document image processing; feature extraction; image colour analysis; image segmentation; optical character recognition; Chinese business cards; Chinese characters; English characters; accuracy; character extraction; character grouping; color business cards; connected component extraction; database; dynamic threshold; extraction rate; foreground-background separation; information extraction; line deletion; local thresholding; mark deletion; noise deletion; numerals; pixel color types; punctuation marks; recognition rate; statistical character recognition system; Character recognition; Color; Colored noise; Computer science; Data mining; Databases; Neural networks; System testing; Telephony; Text analysis
Conference_Title :
Document Analysis and Recognition, 1997., Proceedings of the Fourth International Conference on
Conference_Location :
Ulm
Print_ISBN :
0-8186-7898-4
DOI :
10.1109/ICDAR.1997.620665