Title :
Kannada text line extraction based on energy minimization and skew correction
Author :
Dixit, Sudhaker ; Narayan, Suresh Hosahalli ; Belur, Mahesh
Author_Institution :
Inf. Sci. & Eng. Dept., Dayananda Sagar Coll. of Eng., Bangalore, India
Abstract :
Many governmental, cultural, commercial and educational organizations manage large volumes of handwritten manuscript material. Because Kannada is one of the official languages of South India, these collections include handwritten Kannada documents. Text line segmentation in such documents remains an open document analysis problem, and detecting and correcting the skew angle of the segmented text lines is another important step. Most segmentation algorithms for skewed text lines in the literature are sensitive to the degree of skew, the direction of skew, and the spacing between adjacent lines. This paper proposes a method for text line extraction and skew correction that uses a new cost function, which takes into account both the spacing between text lines and the skew of each text line. Specifically, the problem is formulated as an energy minimization problem, so that minimizing the cost function yields a set of text lines. The baseline skew and fluctuations of these text lines must then be corrected efficiently, and the proposed method includes an efficient algorithm for this baseline correction. The algorithm normalizes the lower baseline to a horizontal line using a skating window approach, thus avoiding the segmentation of text lines into subparts. It copes with baselines that are skewed, fluctuating, or both. Unlike machine learning approaches, it does not require manual assignment of pixels to baselines. Experimental results show that this baseline correction approach considerably improves performance.
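The abstract does not give the details of the baseline correction step, so the following is only a minimal sketch of the general idea of normalizing a lower baseline with a moving window, not the authors' algorithm. It assumes a binary text line image stored as a nested list (1 = ink), estimates the lower baseline of each column as the median of the lowest ink rows inside a window of neighbouring columns, and then shifts each column vertically so that the estimated baseline becomes horizontal. The function names and the `half_width` parameter are hypothetical.

```python
from statistics import median


def lowest_ink_rows(img):
    """Return, for each column, the row index of its lowest ink pixel (None if empty)."""
    height, width = len(img), len(img[0])
    rows = []
    for x in range(width):
        ink = [y for y in range(height) if img[y][x]]
        rows.append(max(ink) if ink else None)
    return rows


def windowed_baseline(rows, half_width):
    """Estimate the lower baseline per column as the median of the lowest
    ink rows inside a window of 2 * half_width + 1 columns (assumed rule)."""
    baseline = []
    for x in range(len(rows)):
        lo, hi = max(0, x - half_width), x + half_width + 1
        window = [r for r in rows[lo:hi] if r is not None]
        baseline.append(round(median(window)) if window else None)
    return baseline


def normalize_baseline(img, half_width=2):
    """Shift every column down so that its estimated baseline lands on one row."""
    height, width = len(img), len(img[0])
    baseline = windowed_baseline(lowest_ink_rows(img), half_width)
    known = [b for b in baseline if b is not None]
    if not known:
        return [row[:] for row in img]  # blank image: nothing to correct
    target = max(known)                 # common row for the corrected baseline
    max_shift = target - min(known)
    # Pad at the bottom so that descenders are not pushed out of the image.
    out = [[0] * width for _ in range(height + max_shift)]
    for x in range(width):
        shift = 0 if baseline[x] is None else target - baseline[x]
        for y in range(height):
            if img[y][x]:
                out[y + shift][x] = 1
    return out


# Toy example: a line whose ink descends one row per column (a skewed baseline).
skewed = [[1 if y == x else 0 for x in range(5)] for y in range(5)]
corrected = normalize_baseline(skewed, half_width=0)
```

Because the line is corrected column by column, it never has to be cut into subparts, which is the property the abstract highlights. A wider window smooths the baseline estimate and leaves short fluctuations partly uncorrected, while a narrow window follows the ink more closely.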
Keywords :
minimisation; text analysis; Kannada handwritten documents; Kannada text line extraction; baseline correction; educational organizations; energy minimization problem; horizontal line; machine learning; manuscript textual information; open document analysis problem; pixel assignments; segmentation algorithms; segmented text lines; skating window; skew angle; skew correction; skewed text lines; text line segmentation; Conferences; Decision support systems; Handheld computers; Document analysis; baseline skew and fluctuations; cost function; energy minimization; skating window approach; skew angle; skew detection and correction;
Conference_Title :
Advance Computing Conference (IACC), 2014 IEEE International
Conference_Location :
Gurgaon
Print_ISBN :
978-1-4799-2571-1
DOI :
10.1109/IAdCC.2014.6779295