DocumentCode :
2418662
Title :
A speech-video synchrony quality metric using CoIA
Author :
Wei Yaodu ; Xie Xiang ; Kuang Jingming ; Han Xinlu
Author_Institution :
Dept. of Electron. Eng., Beijing Inst. of Technol., Beijing, China
fYear :
2010
fDate :
13-14 Dec. 2010
Firstpage :
173
Lastpage :
177
Abstract :
A quality model was built to assess the influence of speech-video asynchrony on perceived audio-visual quality. The audio-visual content was separated into two categories, “speaker inside” and “speaker outside”, depending on whether the speaker appears in the video. For the first category, the speech was shifted by small amounts. DCT coefficients were calculated from the video and MFCC coefficients from the speech. Co-inertia Analysis (CoIA) was used to measure the speech-video correlation, and as the speech was progressively shifted, a correlation curve emerged. The curve was modeled by a Gaussian function, which was then used to predict the perceptual quality. For the “speaker outside” category, a Gaussian curve was likewise used to predict the perceptual quality. A subjective test demonstrated the effectiveness of the proposed method.
Keywords :
Gaussian processes; audio-visual systems; correlation methods; discrete cosine transforms; speech processing; video signal processing; COIA; DCT coefficient; Gaussian function; MFCC coefficient; audio visual content; audio visual quality perception; coinertia analysis; correlation curve; speech video correlation; speech video synchrony quality; Correlation; Hidden Markov models; Mouth; Speech; Streaming media; Synchronization; Audio-visual quality; QVGA; asynchrony; co-inertia analysis; speech;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Packet Video Workshop (PV), 2010 18th International
Conference_Location :
Hong Kong
Print_ISBN :
978-1-4244-9522-1
Electronic_ISBN :
978-1-4244-9520-7
Type :
conf
DOI :
10.1109/PV.2010.5706835
Filename :
5706835