• DocumentCode
    597978
  • Title
    HEVC-based scanned document compression
  • Author
    Zaghetto, A. ; Macchiavello, B. ; de Queiroz, R.L.
  • Author_Institution
    Dept. of Comput. Sci., Univ. de Brasilia, Brasilia, Brazil
  • fYear
    2012
  • fDate
    Sept. 30 2012-Oct. 3 2012
  • Firstpage
    821
  • Lastpage
    824
  • Abstract
    This paper proposes a hybrid pattern matching/transform-based compression engine for scanned compound documents. The novelty of this approach is demonstrated by using a modified version of the HEVC (High Efficiency Video Coding) Test Model as a compound document compressor, here conveniently referred to as HEDC (High Efficiency Document Coder). The proposed method uses segments of a document to create a video sequence, which is then encoded by HEDC. The idea is to exploit interframe prediction as a pattern matching algorithm for coding units pre-classified as text, and intraframe prediction for coding units pre-classified as image. Results show that HEDC outperforms AVC-I, HEVC-I (H.264/AVC and HEVC operating in pure intra mode), H.264/AVC and JPEG2000 by up to 3.3, 2.5, 1.7 and 5 dB, respectively. Furthermore, for most documents the proposed method yields practically the same rate-distortion performance as regular HEVC, but is approximately 5% to 20% faster, because a pre-classification algorithm prevents it from performing all possible inter/intra prediction tests for each prediction unit.
  • Keywords
    data compression; document image processing; image sequences; pattern matching; rate distortion theory; transforms; video coding; AVC-I; H.264-AVC; HEDC encoding; HEVC test model; HEVC-I; HEVC-based scanned document compression; JPEG2000; coding units; document segments; high efficiency document coder; high efficiency video coding; image; interframe prediction; intraframe prediction; pattern matching algorithm; preclassification algorithm; rate-distortion performance; scanned compound document; text; transform-based compression engine; video sequence; Compounds; Image coding; PSNR; Pattern matching; Prediction algorithms; Transform coding; Video coding; High Efficiency Video Coding; Page processing; compound document compression; pattern matching
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Image Processing (ICIP), 2012 19th IEEE International Conference on
  • Conference_Location
    Orlando, FL
  • ISSN
    1522-4880
  • Print_ISBN
    978-1-4673-2534-9
  • Electronic_ISBN
    1522-4880
  • Type
    conf
  • DOI
    10.1109/ICIP.2012.6466986
  • Filename
    6466986