• DocumentCode
    3019229
  • Title
    A segmentation-free approach for keyword search in historical typewritten documents
  • Author
    Gatos, B. ; Konidaris, T. ; Ntzios, K. ; Pratikakis, I. ; Perantonis, S.J.
  • Author_Institution
    Computational Intelligence Lab., Inst. of Informatics & Telecommun., Athens, Greece
  • fYear
    2005
  • fDate
    29 Aug.-1 Sept. 2005
  • Firstpage
    54
  • Abstract
    In this paper, we propose a novel segmentation-free approach for keyword search in historical typewritten documents combining image preprocessing, synthetic data creation, word spotting and user feedback technologies. Our aim is to search for keywords typed by the user in a large collection of digitized typewritten historical documents. The proposed method is based on: (i) image preprocessing for image binarization and enhancement, noisy border and frame removal, orientation and skew correction; (ii) creation of synthetic image words from keywords typed by the user; (iii) word segmentation using dynamic parameters; (iv) efficient feature extraction for each image word and (v) a retrieval procedure that is optimized by user's feedback. Experimental results prove the efficiency of the proposed approach.
  • Keywords
    document image processing; optical character recognition; historical typewritten documents; image binarization; image preprocessing; keyword search; segmentation-free approach; synthetic data creation; user feedback technology; word spotting; Computational intelligence; Feature extraction; Feedback; Image retrieval; Image segmentation; Indexing; Informatics; Keyword search; Laboratories; Text analysis;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 2005. Proceedings. Eighth International Conference on
  • ISSN
    1520-5263
  • Print_ISBN
    0-7695-2420-6
  • Type
    conf
  • DOI
    10.1109/ICDAR.2005.30
  • Filename
    1575509