• DocumentCode
    703131
  • Title
    Locating text in color document images
  • Author
    Ortacag, Erel ; Sankur, Bulent ; Sayood, Khalid
  • Author_Institution
    Dept. of Electr. & Electron. Eng., Bogazici Univ., Istanbul, Turkey
  • fYear
    1998
  • fDate
    8-11 Sept. 1998
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    A novel algorithm for extracting text from cluttered color document images is developed and tested. The algorithm consists of a color segmentation stage followed by rule-based filtering of non-text regions. Text segment extraction uses measurements of geometrical properties and characterness properties together with a set of heuristic rules. The algorithm includes a fusion cycle over three different segmentation maps and a restitution cycle that restores any deleted characters and/or their diacritical marks. The proposed method, which proved successful in extracting text from many color document images, has applications in color image indexing and retrieval.
  • Keywords
    document image processing; image colour analysis; image filtering; image retrieval; image segmentation; text analysis; cluttered color document images; color image indexing; color segmentation; heuristic rules; nontext regions; rule-based filtering; text location; text segment extraction; Clustering algorithms; Color; Image color analysis; Image restoration; Image segmentation; Octrees; Quantization (signal)
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing Conference (EUSIPCO 1998), 9th European
  • Conference_Location
    Rhodes
  • Print_ISBN
    978-960-7620-06-4
  • Type
    conf
  • Filename
    7089601
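
The two-stage pipeline summarized in the abstract (color segmentation, then rule-based filtering of non-text regions) can be illustrated with a minimal sketch. This is not the authors' implementation: the uniform color quantization, the 4-connected region grouping, the function names, and every threshold below are assumptions chosen for illustration, and the fusion and restitution cycles are omitted entirely.

```python
from collections import deque


def quantize(image, levels=4):
    """Coarsely quantize each RGB channel so similar colors share a label.

    Assumption: uniform quantization stands in for the paper's color
    segmentation stage, whose details the abstract does not give.
    """
    step = 256 // levels
    return [[tuple(min(c // step, levels - 1) for c in pixel) for pixel in row]
            for row in image]


def connected_components(labels):
    """Group 4-connected pixels that share a quantized color into regions."""
    h, w = len(labels), len(labels[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for y in range(h):
        for x in range(w):
            if seen[y][x]:
                continue
            color = labels[y][x]
            seen[y][x] = True
            queue = deque([(y, x)])
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and labels[ny][nx] == color):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(pixels)
    return regions


def looks_like_character(pixels, min_area=3, max_area=200,
                         max_aspect=5.0, min_fill=0.2):
    """Apply simple geometric rules; all thresholds are hypothetical."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    area = len(pixels)
    aspect = max(width, height) / min(width, height)
    fill = area / (width * height)  # fraction of the bounding box covered
    return (min_area <= area <= max_area
            and aspect <= max_aspect
            and fill >= min_fill)


def locate_text(image):
    """Return the pixel lists of regions that pass the character rules."""
    return [r for r in connected_components(quantize(image))
            if looks_like_character(r)]
```

Here `image` is a list of rows of `(r, g, b)` tuples. On a white page containing a small dark blob and a long thin rule, the background is rejected by the area rule and the thin rule by the aspect-ratio rule, so only the blob remains as a character candidate.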