Title :
Indexing of ancient document images based on EM algorithm and tangent distance
Author :
Mahjoub, Mohamed Ali ; Jayech, Khaoula
Author_Institution :
Res. Unit SAGE, Nat. Eng. Sch. of Sousse, Sousse, Tunisia
Abstract :
This paper presents a method for indexing images of ancient documents, based on Bayesian density estimation with the EM algorithm and on the tangent distance. We first present the procedure for the case where the mixture density is known, in order to discuss how to go from the density to classification and therefore to indexing. Having set out the problem, we justify the choice of approximating the density by a Gaussian mixture. We then present the indexing algorithm based on the EM algorithm and the tangent distance. The tangent distance is a mathematical tool that compares two patterns (or images) while taking into account small transformations such as rotation and homothety (scaling), phenomena often encountered in ancient documents. The results show the robustness of the method to small global transformations.
Keywords :
Bayes methods; approximation theory; document image processing; expectation-maximisation algorithm; indexing; Bayesian density estimation; EM algorithm; Gaussian mixture model; ancient document image indexing method; density approximation; density classification; expectation-maximization algorithm; tangent distance; Classification algorithms; Estimation; Histograms; Image color analysis; Indexing; Prototypes; Vectors
Conference_Title :
Sciences of Electronics, Technologies of Information and Telecommunications (SETIT), 2012 6th International Conference on
Conference_Location :
Sousse
Print_ISBN :
978-1-4673-1657-6
DOI :
10.1109/SETIT.2012.6481960