DocumentCode
2476040
Title
Compression enhancement of video motion of mouth region using joint audio and video coding
Author
Mujal, Miquel ; Kirlin, R. Lynn
Author_Institution
Dept. de Teoria del Senyal i Comunicacions, Univ. Politecnica de Catalunya, Barcelona, Spain
fYear
2002
fDate
2002
Firstpage
82
Lastpage
86
Abstract
We propose an application that utilises audio and video data dependencies to achieve additional video compression in low-bit-rate encoding systems such as the H.263+ video coding and G.723.1 audio coding standards. The joint correlation of synchronized audio and motion parameters has been proved to exist. Principal component analysis (PCA) by Karhunen-Loeve (KL) expansions, combined with tree-structured vector quantization (TSVQ) algorithms based on Linde-Buzo-Gray (LBG) and competitive learning (CL) techniques, achieves as much as 60% bit reduction for the motion in the mouth region (1% of the overall output bit rate of a P frame) and provides the same motion-compensated image quality in high picture formats. We show performance evaluations that determine the optimal audio parameters, such as linear predictive coefficients (LPC) or line spectrum pairs (LSP), and determine the nature of the motion parameter in each macroblock of the mouth region when using advanced prediction mode (APM) video coding.
Keywords
Karhunen-Loeve transforms; audio coding; code standards; data compression; linear predictive coding; motion compensation; principal component analysis; speech intelligibility; synchronisation; tree data structures; unsupervised learning; vector quantisation; video coding; APM video coding; G.723.1; H.263; Karhunen-Loeve expansions; LPC; LSP; Linde-Buzo-Gray techniques; PCA; TSVQ; advanced prediction mode; audio coding; coding standards; competitive learning; compression enhancement; joint audio video coding; line spectrum pairs; linear predictive coefficients; low-bit rate encoding; macroblock; motion compensated image quality; mouth region; performance evaluations; principal component analysis; speech intelligibility; synchronization; tree-structured vector quantization algorithms; video compression; Audio coding; Bit rate; Encoding; Motion analysis; Mouth; Principal component analysis; Region 1; Vector quantization; Video coding; Video compression;
fLanguage
English
Publisher
IEEE
Conference_Titel
Image Analysis and Interpretation, 2002. Proceedings. Fifth IEEE Southwest Symposium on
Conference_Location
Santa Fe, NM
Print_ISBN
0-7695-1537-1
Type
conf
DOI
10.1109/IAI.2002.999894
Filename
999894