  • DocumentCode
    1492883
  • Title
    Automatic script identification from document images using cluster-based templates
  • Author
    Hochberg, Judith ; Kelly, Patrick ; Thomas, Timothy ; Kerns, Lila
  • Author_Institution
    Los Alamos National Laboratory, NM, USA
  • Volume
    19
  • Issue
    2
  • fYear
    1997
  • fDate
    2/1/1997 12:00:00 AM
  • Firstpage
    176
  • Lastpage
    181
  • Abstract
    We describe an automated script identification system for typeset document images. Templates for each script are created by clustering textual symbols from a training set. Symbols from new images are compared to the templates to find the best-matching script. Our current system processes thirteen scripts with minimal preprocessing and high accuracy.
  • Keywords
    optical character recognition; automatic script identification; cluster-based templates; document images; textual symbol clustering; typeset document images; Character recognition; Image analysis; Indexing; Laboratories; Natural languages; Optical character recognition software; Postal services; Shape; Text analysis; Typesetting
  • fLanguage
    English
  • Journal_Title
    Pattern Analysis and Machine Intelligence, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0162-8828
  • Type
    jour
  • DOI
    10.1109/34.574802
  • Filename
    574802