Abstract:
Interest in optical character recognition has grown tremendously during the past several years. When one considers the volume of printed information that must be accessible both to human readers and to data processing machines, the increasing interest is not at all surprising. Many optical character recognition principles that appear to be quite powerful are impractical or too costly to implement with available technology. Other principles that appear relatively economical require an input quality that is impractical to generate. The field for commercial development falls somewhere between these two extremes. Many factors determine the character recognition principles to be utilized in the development of a machine system. Of primary importance are: 1) the shapes of the symbols to be sensed, 2) the number of different symbols to be discriminated, 3) the print quality range that must be accommodated, and 4) the required performance of the system. Unfortunately, in the present state of the art, the relationship of these factors to each other and to a given principle can be defined only qualitatively. This paper describes and discusses the development of a practical optical character recognition system, the character recognition portion of the IBM 1418 Optical Character Reader. The considerations involved are applicable to the development of any character recognition system.