• DocumentCode
    3673913
  • Title
    Action classification in still images using human eye movements
  • Author
    Gary Ge;Kiwon Yun;Dimitris Samaras;Gregory J. Zelinsky
  • Author_Institution
    Ward Melville High School, East Setauket, NY 11733, USA
  • fYear
    2015
  • fDate
    6/1/2015 12:00:00 AM
  • Firstpage
    16
  • Lastpage
    23
  • Abstract
    Despite recent advances in computer vision, image categorization aimed at recognizing the semantic category of an image, such as scenes, objects, or actions, remains one of the most challenging tasks in the field. However, human gaze behavior can be harnessed to recognize different classes of actions for automated image understanding. To quantify the spatio-temporal information in gaze, we use segments in each image (person, upper-body, lower-body, context) and derive gaze features, which include: number of transitions between segment pairs, avg/max of fixation-density map per segment, dwell time per segment, and a measure of when fixations were made on the person versus the context. We evaluate our gaze features on a subset of images from the challenging PASCAL VOC 2012 Action Classes dataset, while visual features using a Convolutional Neural Network are obtained as a baseline. Two support vector machine classifiers are trained, one with the gaze features and the other with the visual features. Although the baseline classifier outperforms the gaze classifier for classification of 10 actions, analysis of classification results reveals four behaviorally meaningful action groups where classes within each group are often confused by the gaze classifier. When classifiers are retrained to discriminate between these groups, the performance of the gaze classifier improves significantly relative to the baseline. Furthermore, combining gaze and the baseline outperforms both gaze alone and the baseline alone, suggesting both are contributing to the classification decision and illustrating how gaze can improve state-of-the-art methods of automated action classification.
  • Keywords
    "Image segmentation","Context","Visualization","Support vector machines","Image recognition","Computer vision","Training"
  • Publisher
    IEEE
  • Conference_Titel
    Computer Vision and Pattern Recognition Workshops (CVPRW), 2015 IEEE Conference on
  • Electronic_ISBN
    2160-7516
  • Type
    conf
  • DOI
    10.1109/CVPRW.2015.7301288
  • Filename
    7301288
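
The gaze features listed in the abstract (transitions between segment pairs, dwell time per segment, and a measure of when fixations fall on the person versus the context) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each fixation has already been assigned to exactly one segment label and carries a duration in milliseconds; the function name `gaze_features`, the label spellings, the ordered-pair transition count, and the normalized-position timing measure are all hypothetical choices made for illustration. The fixation-density-map features are omitted.

```python
from collections import Counter

# Hypothetical segment labels, following the four segments named in the abstract.
SEGMENTS = ("person", "upper_body", "lower_body", "context")


def gaze_features(fixations):
    """Compute simple gaze features from one viewer's fixations on one image.

    fixations: list of (segment_label, duration_ms) tuples in temporal order.
    Assumption: every label is one of SEGMENTS and segments do not overlap.
    """
    labels = [seg for seg, _ in fixations]

    # Transitions: consecutive fixations that land on different segments,
    # counted per ordered (from, to) pair.
    transitions = Counter(
        (a, b) for a, b in zip(labels, labels[1:]) if a != b
    )

    # Dwell time: total fixation duration accumulated on each segment.
    dwell_ms = {seg: 0.0 for seg in SEGMENTS}
    for seg, duration in fixations:
        dwell_ms[seg] += duration

    # Timing: mean normalized position (0 = first fixation, 1 = last)
    # of the fixations on a given segment; None if there are none.
    n = len(fixations)

    def mean_position(target):
        if n < 2:
            return None
        positions = [i / (n - 1) for i, s in enumerate(labels) if s == target]
        return sum(positions) / len(positions) if positions else None

    return {
        "transitions": transitions,
        "dwell_ms": dwell_ms,
        "person_time": mean_position("person"),
        "context_time": mean_position("context"),
    }


if __name__ == "__main__":
    example = [("context", 200), ("person", 300),
               ("upper_body", 150), ("context", 100)]
    print(gaze_features(example))
```

A feature vector built from such values could then be passed to a support vector machine, as the abstract describes; how the features are normalized and concatenated is not stated in this record.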