• DocumentCode
    3171006
  • Title
    An OCR based translation system between simplified and complex Chinese characters
  • Author
    Shyu, Keh-Hwa ; Lee, Chun-Jen ; Tsay, Mu-King
  • Author_Institution
    Inst. of Comput. Sci. & Electron. Eng., Nat. Central Univ., Chung-Li, Taiwan
  • Volume
    2
  • fYear
    1994
  • fDate
    9-13 Oct 1994
  • Firstpage
    368
  • Abstract
    A new automatic translation system between simplified and complex Chinese characters based on OCR approaches is proposed in this paper. This system can demonstrate an efficient feature extraction algorithm for recognizing either complex or simplified printed Chinese characters. In addition, a new post-processing model proposed in the authors' system not only translates texts between complex and simplified characters, but also corrects character recognition errors. Experimental results show that the average recognition rates are about 99.2% and 95.3% for single font and multi-font recognition, respectively. In testing on real documents of simplified characters, it achieves a recognition rate of 96.2% without contextual post-processing. Using the proposed language model for post-processing, one can improve the final accuracy rate to 97.8%, including the text translation process and the recognition error correction.
  • Keywords
    optical character recognition; Chinese characters; OCR based translation system; accuracy rate; feature extraction algorithm; language model; multi-font recognition; post-processing; recognition error correction; single font recognition; text translation process; Character recognition; Data mining; Feature extraction; Image converters; Laser modes; Lattices; Optical character recognition software; Pattern matching; Pixel; Tin
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pattern Recognition, 1994. Vol. 2 - Conference B: Computer Vision & Image Processing, Proceedings of the 12th IAPR International Conference on
  • Conference_Location
    Jerusalem
  • Print_ISBN
    0-8186-6270-0
  • Type
    conf
  • DOI
    10.1109/ICPR.1994.576940
  • Filename
    576940