Title :
In the Pursuit of Effective Affective Computing: The Relationship Between Features and Registration
Author :
Chew, S.W.; Lucey, P.; Lucey, S.; Saragih, J.; Cohn, J.F.; Matthews, I.; Sridharan, S.
Author_Institution :
Speech, Audio, Image and Video Technology Laboratory, Queensland University of Technology, Brisbane, QLD, Australia
Abstract :
For facial expression recognition systems to be applicable in the real world, they need to detect and track a previously unseen person's face and its facial movements accurately in realistic environments. A highly plausible solution involves performing a "dense" form of alignment, in which 60-70 fiducial facial points are tracked with high accuracy. The problem is that, in practice, this type of dense alignment has, until recently, been infeasible in a generic sense, mainly owing to poor reliability and robustness. Instead, many expression detection methods have opted for a "coarse" form of face alignment, followed by the application of a biologically inspired appearance descriptor such as the histogram of oriented gradients (HOG) or Gabor magnitudes. Encouragingly, recent advances in a number of dense alignment algorithms, e.g., constrained local models (CLMs), have demonstrated both high reliability and accuracy for unseen subjects. This raises the question: aside from countering illumination variation, what do these appearance descriptors do that standard pixel representations do not? In this paper, we show that, when close-to-perfect alignment is obtained, there is no real benefit in employing these appearance-based representations (under consistent illumination conditions). When misalignment does occur, however, we show that these appearance descriptors do add value by encoding robustness to alignment error. For this work, we compared two popular methods for dense alignment, subject-dependent active appearance models (AAMs) and subject-independent CLMs, on the task of action-unit detection. These comparisons were conducted through a battery of experiments across various publicly available data sets (i.e., CK+, Pain, M3, and GEMEP-FERA). We also report our performance in the recent 2011 Facial Expression Recognition and Analysis (FERA) Challenge for the subject-independent task.
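Code_Sketch :
The pipeline the abstract contrasts with dense alignment (coarse registration followed by an appearance descriptor, fed to a support vector machine for action-unit detection) is not given in code in the paper; the following is a minimal illustrative sketch, assuming scikit-image and scikit-learn, with all function names and parameter choices our own.
# Minimal sketch (not the authors' implementation): per-frame action-unit
# detection from registered face crops using a HOG appearance descriptor
# and a linear SVM. Assumes the crops are already aligned (e.g., via
# AAM/CLM landmarks) and resized to a common resolution; the descriptor
# parameters below are illustrative, not the paper's settings.
import numpy as np
from skimage.feature import hog        # appearance descriptor
from sklearn.svm import LinearSVC      # AU presence/absence classifier

def hog_descriptor(face_crop):
    # face_crop: 2-D grayscale array of an aligned face (e.g., 128 x 128)
    return hog(face_crop,
               orientations=8,
               pixels_per_cell=(16, 16),
               cells_per_block=(2, 2),
               block_norm="L2-Hys",
               feature_vector=True)

def train_au_detector(aligned_faces, au_labels):
    # aligned_faces: list of grayscale crops; au_labels: 0/1 per frame
    X = np.stack([hog_descriptor(f) for f in aligned_faces])
    return LinearSVC(C=1.0).fit(X, au_labels)
The raw pixel representation that the paper compares against corresponds to replacing hog_descriptor with face_crop.ravel() on the same aligned crops; the abstract's finding is that the two perform comparably under near-perfect registration, with HOG's advantage emerging only under misalignment.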
Keywords :
emotion recognition; face recognition; feature extraction; image registration; lighting; object tracking; realistic images; CK+ data set; Facial Expression Recognition and Analysis Challenge; GEMEP-FERA data set; Gabor magnitudes; histogram of oriented gradients; M3 data set; Pain data set; action-unit detection; alignment error; appearance-based representations; biologically-inspired appearance descriptors; constrained local models; dense alignment algorithms; face alignment; facial expression recognition systems; facial movements; fiducial facial point tracking; illumination variation; pixel representations; publicly available data sets; realistic environments; robustness encoding; subject-dependent active appearance models; subject-independent CLMs; subject-independent task; unseen person face detection; unseen person face tracking; Accuracy; Active appearance models; Face; Gold; Shape; Support vector machines; automatic facial expression recognition
Journal_Title :
IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)
DOI :
10.1109/TSMCB.2012.2194485