• DocumentCode
    3720773
  • Title
    Language independent rule based classification of printed & handwritten text
  • Author
    Tanzila Saba;Abdulaziz S. Almazyad;Amjad Rehman
  • Author_Institution
    College of Computer and Information Sciences, Prince Sultan University Riyadh, KSA
  • fYear
    2015
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    Handwriting in data entry forms and documents usually indicates user-supplied information that should be treated differently from the printed text. In the Arab world, this filled-in information is normally in English or Arabic. Moreover, classification approaches differ considerably for machine-printed text and handwritten script. Therefore, prior to segmentation and classification, separating text into printed and handwritten entries is mandatory. This research addresses the problem of language-independent text distinction in multilingual data entry forms. The main focus is to distinguish machine-printed text from handwritten script in multilingual data entry forms in a language-independent manner. The proposed approach explores new statistical and structural features of text lines to classify them into separate categories. Accordingly, a set of classification rules is derived to explicitly differentiate machine-printed and handwritten entries written in any language. An additional novelty of the proposed approach is that no training or training data is required; rather, text is discriminated on the basis of simple rules. Promising experimental results with 90% accuracy indicate that the proposed approach is simple and robust. Finally, the scheme is independent of the language, style, size, and fonts that commonly co-exist in multilingual data entry forms.
  • Keywords
    "Artificial intelligence","Robustness","Artificial neural networks","Character recognition","Image recognition"
  • Publisher
    IEEE
  • Conference_Titel
    Evolving and Adaptive Intelligent Systems (EAIS), 2015 IEEE International Conference on
  • Type
    conf
  • DOI
    10.1109/EAIS.2015.7368806
  • Filename
    7368806