DocumentCode
153301
Title
Business Forms Classification Using Earth Mover's Distance
Author
Bukhari, Syed Saqib ; Ebbecke, Markus ; Gillmann, Michael
Author_Institution
Insiders Technol. GmbH, Kaiserslautern, Germany
fYear
2014
fDate
7-10 April 2014
Firstpage
11
Lastpage
15
Abstract
Form classification has received little attention over the last decade. Unfortunately, the algorithms published mainly in the 1980s and 1990s do not meet the requirements of our present commercial document analysis projects. There we are confronted with conditions and requirements unanticipated by that research, such as fax distortions and, even worse, form variations. In this work we introduce a new color-coded, pixel-based form classification method using Earth Mover's Distance (EMD) that is robust against fax distortions and content variations. Experimental results demonstrate the effectiveness of the presented method: it achieved more than 90% classification accuracy on a real-world business forms dataset, which is significantly better than the competing state-of-the-art methods.
Keywords
business forms; document image processing; image colour analysis; statistical distributions; EMD; business form classification; color-coded pixel; earth mover's distance; fax distortion; form classification method; form variation; Business; Earth; Facsimile; Image coding; Image color analysis; Image segmentation; Text analysis; Business Forms; Document Retrieval; Forms Classification
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis Systems (DAS), 2014 11th IAPR International Workshop on
Conference_Location
Tours
Print_ISBN
978-1-4799-3243-6
Type
conf
DOI
10.1109/DAS.2014.59
Filename
6830960