• DocumentCode
    120492
  • Title

    Kannada text line extraction based on energy minimization and skew correction

  • Author

    Dixit, Sudhaker ; Narayan, Suresh Hosahalli ; Belur, Mahesh

  • Author_Institution
    Inf. Sci. & Eng. Dept., Dayananda Sagar Coll. of Eng., Bangalore, India
  • fYear
    2014
  • fDate
    21-22 Feb. 2014
  • Firstpage
    62
  • Lastpage
    67
  • Abstract
    Many governmental, cultural, commercial and educational organizations manage large amounts of manuscript textual information. Because Kannada is one of the official languages of South India, the holdings of such organizations include Kannada handwritten documents. Text line segmentation in such documents remains an open document analysis problem. Detecting and correcting the skew angle of the segmented text lines is another important step in document analysis. Most segmentation algorithms for skewed text lines in the literature are sensitive to the degree of skew, the direction of skew, and the spacing between adjacent lines. The method proposed in this paper for text line extraction and skew correction of the extracted text lines uses a new cost function that considers the spacing between text lines and the skew of each text line. Specifically, the problem is formulated as an energy minimization problem, so that minimizing the cost function yields a set of text lines. The baseline skew and fluctuations of these text lines must then be corrected efficiently. The proposed method therefore also uses an efficient algorithm for baseline correction, which normalizes the lower baseline to a horizontal line using a skating window approach and thus avoids segmenting text lines into subparts. This approach copes with baselines that are skewed, fluctuating, or both. It differs from machine learning approaches, which need manual pixel assignments to baselines. Experimental results show that this baseline correction approach substantially improves performance.
  • Keywords
    minimisation; text analysis; Kannada handwritten documents; Kannada text line extraction; baseline correction; educational organizations; energy minimization problem; horizontal line; machine learning; manuscript textual information; open document analysis problem; pixel assignments; segmentation algorithms; segmented text lines; skating window; skew angle; skew correction; skewed text lines; text line segmentation; Conferences; Decision support systems; Handheld computers; Document analysis; baseline skew and fluctuations; cost function; energy minimization; skating window approach; skew angle; skew detection and correction;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advance Computing Conference (IACC), 2014 IEEE International
  • Conference_Location
    Gurgaon
  • Print_ISBN
    978-1-4799-2571-1
  • Type

    conf

  • DOI
    10.1109/IAdCC.2014.6779295
  • Filename
    6779295