Title :
Scale and rotation invariant recognition of cursive Pashto script using SIFT features
Author :
Ahmad, Riaz ; Amin, Saman Hameed ; Khan, Mohammad A U
Author_Institution :
Dept. of Comput. Sci., FAST-NUCES, Peshawar, Pakistan
Abstract :
Cursive scripts such as Urdu, Pashto and Arabic contain a large number of unique shapes called ligatures. Recognizing thousands of ligatures is challenging because of variations in scale, orientation, font style and spatial location/registration of ligatures, and because only a limited number of samples is available for training. Accurate segmentation is a key challenge for analytic approaches, whereas holistic approaches suffer from the limitations of their feature representation schemes. This paper proposes the SIFT descriptor for representing Pashto ligatures and evaluates its effectiveness in overcoming the above-mentioned challenges within a holistic framework. The proposed approach is script independent and can be easily adapted to other cursive languages. Recognition results are compared against classical methods such as PCA to test the effectiveness of the feature representation. Our research shows that the SIFT descriptor performs better than classical feature representation methods such as PCA. The proposed recognition is holistic, using ligature (word) based classification. We tested 1000 unique ligatures (images) at 4 different sizes, along with their rotated images, and obtained an average recognition rate of 74%.
Keywords :
optical character recognition; PCA; SIFT descriptor; cursive Pashto script; holistic recognition; image recognition; ligatures; rotation invariant recognition; Character recognition; Feature extraction; Image segmentation; Optical character recognition software; Principal component analysis; Shape; Training;
Conference_Title :
Emerging Technologies (ICET), 2010 6th International Conference on
Conference_Location :
Islamabad
Print_ISBN :
978-1-4244-8057-9
DOI :
10.1109/ICET.2010.5638470