Title :
Audiovisual speech/speaker recognition, application to Arabic language
Author :
Chelali, Fatma Zohra ; Djeradi, Amar
Author_Institution :
Speech Commun. & Signal Process. Lab., Univ. of Sci. & Technol. Houari Boumedienne, Algiers, Algeria
Abstract :
Audio-only speaker/speech recognition (ASR) systems are far from perfect, especially under noisy conditions. Moreover, the content of speech can be partially recovered through lip-reading. Human speech perception is bimodal in nature: humans combine audio and visual information to decide what has been spoken, especially in noisy environments. In this paper, we describe a speaker identification system in which lip information is fused with the corresponding speech information from each speaker. The energy, the zero cross ratio (ZCR), and the pitch are used as features for the audio modality. The features for the lip texture modality are 2D-DCT coefficients of the luminance component. Intuitively, we expect lip information to be somewhat complementary to speech information because of the range of lip movements associated with producing the corresponding phonemes. A multilayer perceptron is used as the classifier.
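The feature set named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the frame and block sizes, the number of retained DCT coefficients (`keep`), and fusion by simple concatenation are all assumptions made for illustration. ZCR is computed here as the usual zero-crossing rate, and pitch estimation and the multilayer perceptron itself are left out.

```python
import math

def short_time_energy(frame):
    """Mean squared amplitude of one audio frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)

def dct_1d(v):
    """Orthonormal DCT-II of a sequence."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def dct_2d(block):
    """Separable 2D DCT-II: transform the rows, then the columns."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    # cols is indexed [column][frequency]; transpose back to [u][v].
    return [list(r) for r in zip(*cols)]

def lip_features(luma_block, keep=3):
    """Keep the top-left keep x keep low-frequency coefficients (assumed choice)."""
    coeffs = dct_2d(luma_block)
    return [coeffs[u][v] for u in range(keep) for v in range(keep)]

def audiovisual_vector(frame, luma_block, keep=3):
    """Concatenate audio and lip features (assumed fusion scheme)."""
    audio = [short_time_energy(frame), zero_crossing_rate(frame)]
    return audio + lip_features(luma_block, keep)
```

A vector built this way could then be passed to any multilayer perceptron implementation. Keeping only the low-frequency DCT coefficients is a common way to compact an image region, but the abstract does not state how many coefficients are used or how they are selected.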
Keywords :
discrete cosine transforms; multilayer perceptrons; natural language processing; signal classification; speaker recognition; speech intelligibility; 2D-DCT coefficient; Arabic language; audio information; audio modality; audio-only speaker recognition; audiovisual speech recognition; human speech perception; lip movement; lip texture modality; lip-reading; luminance component; multilayer perceptron classifier; noisy environment; phoneme; pitch; speaker identification system; speech content; speech recognition system; visual information; zero cross ratio; Correlation; Feature extraction; Mouth; Speech; Speech recognition; Visualization; Viseme classification for Arabic visual speech recognition
Conference_Title :
Multimedia Computing and Systems (ICMCS), 2011 International Conference on
Conference_Location :
Ouarzazate
Print_ISBN :
978-1-61284-730-6
DOI :
10.1109/ICMCS.2011.5945713