Title :
Local Binary Patterns for Arabic Optical Font Recognition
Author :
Nicolaou, Anguelos ; Slimane, Fouad ; Maergner, Volker ; Liwicki, Marcus
Author_Institution :
Image & Voice Anal. (DIVA) Group, Univ. of Fribourg, Fribourg, Switzerland
Abstract :
Optical Font Recognition (OFR) has been shown to increase Optical Character Recognition (OCR) accuracy, and it can also help in harvesting semantic information from documents. It is therefore becoming part of many Document Image Analysis (DIA) pipelines. Our work is based on the hypothesis that Local Binary Patterns (LBP), as a generic texture classification method, can address several distinct DIA problems at the same time, such as OFR, script detection, and writer identification. In this paper we strip down the Redundant Oriented LBP (RO-LBP) method, previously used in writer identification, and apply it to OFR with the goal of introducing a generic method that classifies text as oriented texture. We focus on Arabic OFR and aim to perform a thorough comparison of our method with the leading Gaussian Mixture Model (GMM) method, which was developed specifically for the task. Because OFR methods differ in nature, each method's performance is usually evaluated on different data and with different evaluation protocols. The proposed experimental procedure addresses this problem and allows us to compare fundamentally different OFR methods by adapting them to a common measurement protocol. In the experiments performed, the LBP method achieves perfect results on large text blocks generated from the APTI database while preserving its broad generic attributes, as shown by secondary experiments.
Keywords :
Gaussian processes; document image processing; image classification; image texture; mixture models; optical character recognition; Arabic OFR; Arabic optical font recognition; DIA; Gaussian mixture model method; OCR; RO-LBP method; document image analysis; evaluation protocols; local binary patterns; redundant oriented LBP method; script detection; semantic information; text classification; texture classification method; writer identification; Databases; Feature extraction; Histograms; Optical character recognition software; Text recognition; Training; Arabic; GMM; LBP; OFR; script; texture
Conference_Title :
Document Analysis Systems (DAS), 2014 11th IAPR International Workshop on
Conference_Location :
Tours
Print_ISBN :
978-1-4799-3243-6
DOI :
10.1109/DAS.2014.71