• DocumentCode
    3325266
  • Title
    Optimal techniques in OCR error correction for Japanese texts
  • Author
    Hisamitsu, Toru ; Marukawa, Katsumi ; Shima, Yoshihiro ; Fujisawa, Hiromichi ; Nitta, Yoshihiko
  • Author_Institution
    Adv. Res. Lab., Hitachi Ltd., Saitama, Japan
  • Volume
    2
  • fYear
    1995
  • fDate
    14-16 Aug 1995
  • Firstpage
    1014
  • Abstract
    This paper investigates three fundamental techniques in OCR error correction for Japanese texts using morphological analysis: (1) an optimal method for candidate word extraction from a candidate character lattice, (2) optimal word entries for Japanese verb inflection analysis, and (3) a new method of word matching cost calculation that is better suited for use with linguistic criteria. Comparative evaluation shows that the combination of these techniques requires 84% less computation, captures 2.6% more candidate words, reduces the chart parsing computation by 20%, and attains a 25% higher error correction rate than a commonly used method.
  • Keywords
    error correction; grammars; optical character recognition; Japanese texts; Japanese verb inflection analysis; OCR error correction; candidate character lattice; candidate word extraction; chart parsing computation; linguistic criteria; morphological analysis; optimal method; optimal techniques; optimal word entries; word matching cost calculation; Character generation; Cost function; Dictionaries; Distributed decision making; Error analysis; Error correction; Laboratories; Lattices; Optical character recognition software; Text recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 1995., Proceedings of the Third International Conference on
  • Conference_Location
    Montreal, Que.
  • Print_ISBN
    0-8186-7128-9
  • Type
    conf
  • DOI
    10.1109/ICDAR.1995.602074
  • Filename
    602074