Title :
Feature-level data fusion for bimodal person recognition
Author :
Chibelushi, C.C. ; Mason, J.S.D. ; Deravi, F.
Author_Institution :
Staffordshire Univ., UK
Abstract :
Consistently high person recognition accuracy is difficult to attain using a single recognition modality. This paper assesses the fusion of voice and outer lip-margin features for person identification. Feature fusion is investigated in three forms: audio-visual feature vector concatenation, principal component analysis (PCA), and linear discriminant analysis (LDA). The paper shows that, under mismatched test and training conditions, audio-visual feature fusion is equivalent to an effective increase in the signal-to-noise ratio of the audio signal. Audio-visual feature vector concatenation is shown to be an effective method for feature combination, and LDA is shown to pack discriminating audio-visual information into fewer coefficients than PCA. The paper also reveals that bimodal person identification is highly sensitive to a mismatch between the noise conditions used to train the LDA or PCA feature-fusion module and those used to train the speaker models. Such a mismatch leads to worse identification accuracy than unimodal identification.
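The three fusion forms named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the feature dimensions, the synthetic data, and the helper names (`concatenate_features`, `pca_projection`, `lda_projection`) are assumptions made only for illustration, and the LDA shown is the textbook Fisher formulation with a small ridge term added for numerical stability.

```python
import numpy as np

def concatenate_features(audio, visual):
    """Fuse by placing each frame's audio and visual vectors side by side."""
    return np.hstack([audio, visual])

def pca_projection(X, n_components):
    """Return (mean, W); columns of W are the top principal directions."""
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)            # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]  # keep the largest
    return mean, eigvecs[:, order]

def lda_projection(X, labels, n_components, ridge=1e-6):
    """Return (mean, W) from Fisher LDA: leading eigenvectors of Sw^-1 Sb."""
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in np.unique(labels):
        Xc = X[labels == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    Sw += ridge * np.eye(d)  # assumption: small ridge keeps Sw invertible
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(eigvals.real)[::-1][:n_components]
    return mean, eigvecs[:, order].real

# Synthetic stand-in data (hypothetical sizes): 3 persons, 50 frames each,
# 4 audio coefficients and 3 lip-margin coefficients per frame.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 50)
audio = rng.normal(size=(150, 4)) + labels[:, None]
visual = rng.normal(size=(150, 3)) + 0.5 * labels[:, None]

fused = concatenate_features(audio, visual)   # 7 coefficients per frame

pca_mean, pca_W = pca_projection(fused, 2)
fused_pca = (fused - pca_mean) @ pca_W        # 2 coefficients per frame

lda_mean, lda_W = lda_projection(fused, labels, 2)
fused_lda = (fused - lda_mean) @ lda_W        # 2 coefficients per frame
```

In this sketch the projection matrices are estimated from the same data they are applied to. The sensitivity reported in the abstract concerns the case where the fusion module and the speaker models are trained under different noise conditions, which would correspond to estimating `pca_W` or `lda_W` on data with a different noise level than the data used to build the speaker models.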
Keywords :
face recognition; audio signal; audio-visual feature fusion; audio-visual feature vector concatenation; bimodal person recognition; coefficients; feature level data fusion; feature-fusion module; identification accuracy; linear discriminant analysis; mismatched test conditions; mismatched training conditions; outer lip-margin features; person recognition accuracy; principal component analysis; signal to noise ratio; speaker model training noise-conditions; unimodal identification; voice features
Conference_Title :
Image Processing and Its Applications, 1997., Sixth International Conference on
Conference_Location :
Dublin
Print_ISBN :
0-85296-692-X
DOI :
10.1049/cp:19970924