• DocumentCode
    2417375
  • Title

    Business form classification using strings

  • Author

    Ting, Antoine ; Leung, Maylor K. H.

  • Author_Institution
    Sch. of Appl. Sci., Nanyang Technol. Inst., Singapore
  • Volume
    2
  • fYear
    1996
  • fDate
    25-29 Aug 1996
  • Firstpage
    690
  • Abstract
    Business forms are “linear” documents which can be accurately described by a one-dimensional data structure. This paper proposes a novel approach for form identification using strings. This application can be used as a basis for extension to other “linear” documents such as logos or line drawings. A set of known blank forms is stored in a database and incoming forms are automatically matched to one of these. In addition, forms which are not in the database can also be detected. A novel and simple method is used for matching by considering a distinctive “signature” for each document. This takes the shape of a string which describes the elements present on the form. Included are the location and size of lines, corners and blocks of text, quantised as discrete symbols. A specially adapted and efficient string edit distance calculation is then applied for matching. Unregistered forms can be detected by examining the unmatched elements between two strings. This novel string format makes it possible to extend the conventional one-dimensional representation possibilities of strings to a richer “one-and-a-half dimensional” structure and requires no training.
  • Keywords
    document image processing; image classification; image segmentation; visual databases; business form classification; discrete symbols; form identification; line drawings; linear documents; logos; one-dimensional data structure; string edit distance calculation; strings; Application software; Data structures; Dynamic programming; Neural networks; Pattern matching; Pattern recognition; Shape; Software systems; Spatial databases; Trademarks
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pattern Recognition, 1996., Proceedings of the 13th International Conference on
  • Conference_Location
    Vienna
  • ISSN
    1051-4651
  • Print_ISBN
    0-8186-7282-X
  • Type

    conf

  • DOI
    10.1109/ICPR.1996.546911
  • Filename
    546911
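
The matching step described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the record gives no details of the "specially adapted" edit distance or of the symbol alphabet, so this sketch substitutes the conventional unit-cost Levenshtein distance, and it rejects unregistered forms with a simple distance threshold rather than by examining unmatched elements. The symbol encoding `(kind, x, y, size)`, the function names `edit_distance` and `classify`, and the `reject_ratio` parameter are all assumptions made for illustration.

```python
from typing import Dict, Hashable, Optional, Sequence

# Hypothetical symbol encoding: (kind, quantised x, quantised y, quantised size),
# e.g. ("line", 0, 0, 4). The paper's actual alphabet is not given in this record.
Symbol = Hashable


def edit_distance(a: Sequence[Symbol], b: Sequence[Symbol]) -> int:
    """Conventional Levenshtein distance with unit costs (dynamic programming)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, sa in enumerate(a, start=1):
        curr = [i]
        for j, sb in enumerate(b, start=1):
            cost = 0 if sa == sb else 1
            curr.append(min(prev[j] + 1,           # delete sa
                            curr[j - 1] + 1,       # insert sb
                            prev[j - 1] + cost))   # match or substitute
        prev = curr
    return prev[-1]


def classify(signature: Sequence[Symbol],
             database: Dict[str, Sequence[Symbol]],
             reject_ratio: float = 0.5) -> Optional[str]:
    """Return the closest stored form, or None if nothing is close enough.

    reject_ratio is an assumed threshold on distance divided by the longer
    string length; it stands in for the paper's unmatched-element test.
    """
    best_name, best_ratio = None, None
    for name, stored in database.items():
        longest = max(len(signature), len(stored), 1)
        ratio = edit_distance(signature, stored) / longest
        if best_ratio is None or ratio < best_ratio:
            best_name, best_ratio = name, ratio
    if best_ratio is None or best_ratio > reject_ratio:
        return None  # treated as an unregistered form
    return best_name
```

In use, each blank form in the database would be reduced to one such symbol string, and an incoming form's string would be passed to `classify`. Because the comparison is a direct distance computation between strings, no training phase is involved, which matches the claim in the abstract.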