• DocumentCode
    153301
  • Title

    Business Forms Classification Using Earth Mover's Distance

  • Author

    Bukhari, Syed Saqib ; Ebbecke, Markus ; Gillmann, Michael

  • Author_Institution
    Insiders Technol. GmbH, Kaiserslautern, Germany
  • fYear
    2014
  • fDate
    7-10 April 2014
  • Firstpage
    11
  • Lastpage
    15
  • Abstract
    Form classification has received little research attention over the last decade. Unfortunately, the algorithms published mainly in the 1980s and 1990s do not meet the requirements of our present commercial document analysis projects. There we are confronted with conditions and requirements unanticipated by that research, such as fax distortions and, even worse, form variations. In this work we introduce a new color-coded pixel-based form classification method using Earth Mover's Distance (EMD) that is robust against fax distortions and content variations. Experimental results demonstrate the effectiveness of the presented method. It achieved more than 90% classification accuracy on a real-world business forms dataset, which is significantly better than the competing state-of-the-art methods.
  • Keywords
    business forms; document image processing; image colour analysis; statistical distributions; EMD; business form classification; color-coded pixel; earth mover's distance; fax distortion; form classification method; form variation; Business; Earth; Facsimile; Image coding; Image color analysis; Image segmentation; Text analysis; Business Forms; Document Retrieval; Forms Classification
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis Systems (DAS), 2014 11th IAPR International Workshop on
  • Conference_Location
    Tours
  • Print_ISBN
    978-1-4799-3243-6
  • Type

    conf

  • DOI
    10.1109/DAS.2014.59
  • Filename
    6830960