• DocumentCode
    2196103
  • Title

    Recognition of Chinese business cards

  • Author

    Chiou, Yaw-Huei ; Lee, Hsi-Jian

  • Author_Institution
    Dept. of Comput. Sci. & Inf. Eng., Nat. Chiao Tung Univ., Hsinchu, Taiwan
  • Volume
    2
  • fYear
    1997
  • fDate
    18-20 Aug 1997
  • Firstpage
    1028
  • Abstract
    Business cards carry many kinds of information, such as names, addresses and telephone numbers. To use this information effectively, it must be extracted from the cards automatically so that a database can be built. The goal of this paper is to extract and recognize characters from color business cards. To separate the foreground from the background in a card, we assign all pixels to eight color types. Then we calculate a dynamic threshold using the color information to extract the foreground. Next, we extract the characters in four steps: (i) connected component extraction, (ii) local thresholding, (iii) mark, line and noise deletion, and (iv) character grouping. Finally, we recognize the characters with a statistical Chinese and English character recognition system. We test the system on 30 business cards containing Chinese characters, English characters, numerals and punctuation marks. The extraction rate and accuracy of our system are 96.97% and 95.43%, respectively. The recognition rate is 88.78% for Chinese characters and 97.58% for English characters, numerals and punctuation marks.
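    The first two stages named in the abstract (assigning pixels to eight color types, then extracting connected components from the foreground) can be illustrated with a minimal sketch. The abstract does not say how the eight types are defined or how the dynamic threshold is computed, so everything below is an assumption made for illustration: each RGB channel is binarized at a fixed cut of 128 (giving 2 x 2 x 2 = 8 types), the most frequent type is treated as background, and components are 4-connected. The names `color_type`, `foreground_mask` and `connected_components` are hypothetical and do not come from the paper.

```python
from collections import Counter, deque


def color_type(pixel, cut=128):
    """Map an (r, g, b) pixel to one of 8 types by binarizing each channel.

    Assumed scheme: the paper does not specify how its eight types are defined.
    """
    r, g, b = pixel
    return (int(r >= cut) << 2) | (int(g >= cut) << 1) | int(b >= cut)


def foreground_mask(image, cut=128):
    """Mark as foreground every pixel whose type differs from the most common type.

    A fixed cut stands in for the paper's dynamic threshold (assumption).
    """
    types = [[color_type(p, cut) for p in row] for row in image]
    background = Counter(t for row in types for t in row).most_common(1)[0][0]
    return [[t != background for t in row] for row in types]


def connected_components(mask):
    """Return the 4-connected foreground components as lists of (row, col)."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    components = []
    for r0 in range(height):
        for c0 in range(width):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            seen[r0][c0] = True
            queue, component = deque([(r0, c0)]), []
            while queue:
                r, c = queue.popleft()
                component.append((r, c))
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < height and 0 <= nc < width
                            and mask[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            components.append(component)
    return components


# Tiny synthetic "card": white background with a black stroke and a red stroke.
W, K, R = (255, 255, 255), (0, 0, 0), (200, 10, 10)
image = [
    [W, K, W, W, R],
    [W, K, W, W, R],
    [W, W, W, W, W],
]
components = connected_components(foreground_mask(image))
```

    In this toy input, `components` holds one component per stroke. A real pipeline would follow this with the local thresholding, mark/line/noise deletion and character grouping steps listed in the abstract, which are not sketched here.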
  • Keywords
    business forms; document image processing; feature extraction; image colour analysis; image segmentation; optical character recognition; Chinese business cards; Chinese characters; English characters; accuracy; character extraction; character grouping; color business cards; connected component extraction; database; dynamic threshold; extraction rate; foreground-background separation; information extraction; line deletion; local thresholding; mark deletion; noise deletion; numerals; pixel color types; punctuation marks; recognition rate; statistical character recognition system; Character recognition; Color; Colored noise; Computer science; Data mining; Databases; Neural networks; System testing; Telephony; Text analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 1997., Proceedings of the Fourth International Conference on
  • Conference_Location
    Ulm
  • Print_ISBN
    0-8186-7898-4
  • Type

    conf

  • DOI
    10.1109/ICDAR.1997.620665
  • Filename
    620665