DocumentCode :
669175
Title :
Ridgelet-DTW-based word spotting for Arabic historical document
Author :
Brik, Youcef ; Chibani, Youcef ; Zemouri, Et-Tahir ; Sehad, Abdenour
Author_Institution :
Speech Commun. & Signal Process. Lab., Univ. of Sci. & Technol. Houari Boumediene, Algiers, Algeria
fYear :
2013
fDate :
4-6 Sept. 2013
Firstpage :
194
Lastpage :
199
Abstract :
In this paper we propose a system for word spotting in Arabic historical documents using the Ridgelet transform and Dynamic Time Warping (DTW). First, preprocessing and segmentation are applied to all document pages to create a word image dataset. Keeping each word at its original size, a Ridgelet descriptor is generated without applying the normalization criteria for the Radon transform, where rotation, translation and scaling invariance is achieved. Therefore, the DTW algorithm is employed to match corresponding projection angle pairs from the Ridgelet descriptors, while avoiding the problems associated with reducing the descriptor sets to a single vector, which causes a loss of useful information. Experiments were conducted on historical Arabic documents from the National Library. The obtained results showed the effectiveness of the proposed method.
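The matching step described in the abstract can be illustrated with a minimal sketch. It assumes each word's descriptor is a list of 1-D coefficient sequences, one per projection angle, whose lengths may differ between words because the images are not resized. The names `dtw_distance` and `descriptor_distance`, the absolute-difference local cost, and the summation over angles are illustrative assumptions, not details taken from the paper.

```python
from math import inf


def dtw_distance(a, b):
    """Classic DTW cost between two 1-D sequences.

    Assumption: the local cost is the absolute difference of samples.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def descriptor_distance(desc_a, desc_b):
    """Sum the DTW costs of sequences that share a projection angle.

    Assumption: both descriptors list their angles in the same order;
    summing the per-angle costs is a hypothetical aggregation rule.
    """
    if len(desc_a) != len(desc_b):
        raise ValueError("descriptors must cover the same projection angles")
    return sum(dtw_distance(x, y) for x, y in zip(desc_a, desc_b))


# Two toy descriptors with two projection angles each (made-up values).
query = [[1, 2, 3], [0, 0]]
candidate = [[1, 2, 2, 3], [1, 1]]
print(descriptor_distance(query, candidate))
```

Because each angle is compared as its own sequence, the descriptor is never flattened into one fixed-length vector, which is the information loss the abstract says the method avoids.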
Keywords :
Radon transforms; document image processing; image segmentation; natural language processing; text analysis; Arabic historical document; DTW algorithm; Radon transform; descriptor sets; dimensionality reduction; document pages preprocessing; document pages segmentation processes; dynamic time warping; ridgelet descriptor; ridgelet transform; ridgelet-DTW-based word spotting; scaling invariance; word image dataset; Frequency modulation; Shape; Signal processing algorithms; Vectors; Wavelet transforms; arabic historical document; dynamic time warping (DTW); ridgelet transform; word spotting;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Image and Signal Processing and Analysis (ISPA), 2013 8th International Symposium on
Conference_Location :
Trieste
Type :
conf
DOI :
10.1109/ISPA.2013.6703738
Filename :
6703738