• DocumentCode
    2476040
  • Title

    Compression enhancement of video motion of mouth region using joint audio and video coding

  • Author

    Mujal, Miquel ; Kirlin, R. Lynn

  • Author_Institution
    Dept. de Teoria del Senyal i Comunicacions, Univ. Politecnica de Catalunya, Barcelona, Spain
  • fYear
    2002
  • fDate
    2002
  • Firstpage
    82
  • Lastpage
    86
  • Abstract
    We propose an application that utilises audio and video data dependencies to achieve additional video compression in low-bit-rate encoding systems such as the H.263+ video coding and G.723.1 audio coding standards. The joint correlation of synchronized audio and motion parameters has been shown to exist. Principal component analysis (PCA) by Karhunen-Loeve (KL) expansions, combined with tree-structured vector quantization (TSVQ) algorithms based on Linde-Buzo-Gray (LBG) and competitive learning (CL) techniques, achieves as much as 60% bit reduction for the motion in the mouth region (1% of the overall output bit rate of a P frame) and provides the same motion-compensated image quality in high picture formats. We show performance evaluations that determine the optimal audio parameters, such as linear predictive coefficients (LPC) or line spectrum pairs (LSP), and determine the nature of the motion parameter in each macroblock of the mouth region when using advanced prediction mode (APM) video coding.
  • Keywords
    Karhunen-Loeve transforms; audio coding; code standards; data compression; linear predictive coding; motion compensation; principal component analysis; speech intelligibility; synchronisation; tree data structures; unsupervised learning; vector quantisation; video coding; APM video coding; G.723.1; H.263; Karhunen-Loeve expansions; LPC; LSP; Linde-Buzo-Gray techniques; PCA; TSVQ; advanced prediction mode; audio coding; coding standards; competitive learning; compression enhancement; joint audio video coding; line spectrum pairs; linear predictive coefficients; low-bit rate encoding; macroblock; motion compensated image quality; mouth region; performance evaluations; principal component analysis; speech intelligibility; synchronization; tree-structured vector quantization algorithms; video compression; Audio coding; Bit rate; Encoding; Motion analysis; Mouth; Principal component analysis; Region 1; Vector quantization; Video coding; Video compression;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Image Analysis and Interpretation, 2002. Proceedings. Fifth IEEE Southwest Symposium on
  • Conference_Location
    Santa Fe, NM
  • Print_ISBN
    0-7695-1537-1
  • Type

    conf

  • DOI
    10.1109/IAI.2002.999894
  • Filename
    999894