• Title of article

    Automatically Deciding if a Document was Scanned or Photographed

  • Author/Authors

    e Silva, Gabriel Pereira (Federal University of Pernambuco, Brazil); Lins, Rafael Dueire (Federal University of Pernambuco, Brazil); Miro, Brenno (Federal University of Pernambuco, Brazil); Simske, Steven J. (HP Labs, USA); Thielo, Marcelo (HP Labs, Brazil)

  • From page
    3364
  • To page
    3375
  • Abstract
    Portable digital cameras are widely used by students and professionals in different fields as a practical way to digitize documents. Tools such as PhotoDoc enable the batch processing of such documents, performing automatic border removal and perspective correction. A PhotoDoc-processed document and a scanned one look very similar to the human eye if both are in true color. However, when a batch of documents is binarized automatically, those digitized with portable cameras exhibit different features from those digitized with scanners. Knowing the source of each document is therefore fundamental for successful processing. This paper presents a classification strategy to distinguish between scanned and photographed documents. Over 16,000 documents were tested, with a correct classification rate of over 99.96%.
  • Keywords
    MPEG-7, content-based Multimedia Retrieval, Hypermedia systems, Web-based services, XML, Semantic Web, Multimedia
  • Journal title
    J.UCS (Journal of Universal Computer Science)
  • Record number

    2661594