• DocumentCode
    3059999
  • Title

    Word image matching as a technique for degraded text recognition

  • Author

    Hull, Jonathan J. ; Khoubyari, Siamak ; Ho, Tin Kam

  • Author_Institution
    Dept. of Comput. Sci., State Univ. of New York, Buffalo, NY, USA
  • fYear
    1992
  • fDate
    30 Aug-3 Sep 1992
  • Firstpage
    665
  • Lastpage
    668
  • Abstract
    A technique is presented that determines equivalences between word images in a passage of text. A clustering procedure is applied to group visually similar words. Initial hypotheses for the identities of words are then generated by matching the word groups to language statistics that predict the frequency at which certain words will occur. This is followed by a recognition step that assigns identifications to the images in the clusters. This paper concentrates on the clustering algorithm. A clustering technique is presented and its performance on a running text of 1062 word images is determined. It is shown that the clustering algorithm can correctly locate groups of short function words with a correct rate better than 95 percent.
  • Keywords
    document image processing; image recognition; clustering algorithm; clustering procedure; degraded text recognition; equivalences; language statistics; recognition step; short function words; word groups; word images; Character recognition; Clustering algorithms; Degradation; Dictionaries; Error correction; Image matching; Image recognition; Natural languages; Statistics; Text recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pattern Recognition, 1992. Vol.II. Conference B: Pattern Recognition Methodology and Systems, Proceedings., 11th IAPR International Conference on
  • Conference_Location
    The Hague
  • Print_ISBN
    0-8186-2915-0
  • Type

    conf

  • DOI
    10.1109/ICPR.1992.201864
  • Filename
    201864