Regional Information Center for Science and Technology - Multi-stream Asynchrony Modeling for Audio-Visual Speech Recognition

DocumentCode :

2519315

Title :

Multi-stream Asynchrony Modeling for Audio-Visual Speech Recognition

Author :

Lv, Guoyun ; Jiang, Dongmei ; Zhao, Rongchun ; Hou, Yunshu

Author_Institution :

Northwestern Polytech. Univ., Xian

fYear :

2007

fDate :

10-12 Dec. 2007

Firstpage :

Lastpage :

Abstract :

In this paper, two multi-stream asynchrony Dynamic Bayesian Network models (the MS-ADBN model and the MM-ADBN model) are proposed for audio-visual speech recognition (AVSR). The proposed models, which have different topologies, allow the audio and visual streams to be asynchronous up to the word level. In the MS-ADBN model, in both the audio stream and the visual stream, each word is composed of its corresponding phones, and each phone is associated with an observation vector. The MM-ADBN model augments the MS-ADBN model: a level of hidden nodes, the state level, is added between the phone level and the observation node level to describe the dynamic process of phones. Essentially, the MS-ADBN model is a word model, while the MM-ADBN model is a phone model. Speech recognition experiments are conducted on a digit audio-visual (A-V) database as well as on a continuous A-V database. The results demonstrate that describing the asynchrony between the audio and visual streams is important for an AVSR system, and that the MM-ADBN model performs best on the continuous A-V speech recognition task.

Keywords :

audio-visual systems; belief networks; speech recognition; audio streams; audio-visual speech recognition; continuous A-V database; digit audio-visual database; dynamic Bayesian network models; hidden node; multistream asynchrony modeling; observation node level; visual streams; Audio databases; Background noise; Bayesian methods; Hidden Markov models; Speech enhancement; Speech recognition; Streaming media; Visual databases; Vocabulary; Working environment noise;

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Multimedia, 2007. ISM 2007. Ninth IEEE International Symposium on

Conference_Location :

Taichung

Print_ISBN :

978-0-7695-3058-1

Type :

conf

DOI :

10.1109/ISM.2007.4412354

Filename :

4412354

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2519315