Inferring social roles in long timespan video sequence

Author

Zhang, Jiangen ; Hu, Wenze ; Yao, Benjamin ; Wang, Yongtian ; Zhu, Song-Chun

Author_Institution

Sch. of Comput. Sci. & Technol., Beijing Inst. of Technol., Beijing, China

fYear

2011

fDate

6-13 Nov. 2011

Firstpage

1456

Lastpage

1463

Abstract

In this paper, we present a method for inferring social roles of agents (persons) from their daily activities in long surveillance video sequences. We define activities as interactions between an agent's position and semantic hotspots within the scene. Given a surveillance video, our method first tracks the locations of agents and then automatically discovers semantic hotspots in the scene. By enumerating spatial/temporal locations between an agent's feet and hotspots in a scene, we define a set of atomic actions, which in turn compose sub-events and events. The numbers and types of events performed by an agent are assumed to be driven by his/her social role. With the grammar model induced by composition rules, an adapted Earley parser algorithm is used to parse the trajectories into events, sub-events and atomic actions. With probabilistic output of events, the roles of agents can be predicted under the Bayesian inference framework. Experiments are carried out on a challenging 8.5-hour video from a surveillance camera in the lobby of a research lab. The video contains 7 different social roles, including “manager”, “researcher”, “developer”, “engineer”, “staff”, “visitor” and “mailman”. Results show that our proposed method can predict the role of each agent with high precision.
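The final step described in the abstract, predicting a role from the events an agent performs, can be illustrated with a minimal sketch. It is not the authors' model: it assumes hard event counts rather than the parser's probabilistic output, treats events as independent given the role, and uses made-up event names, probabilities and priors (`EVENT_PROBS`, `PRIOR` and `role_posterior` are hypothetical names introduced here).

```python
import math

# Hypothetical P(event | role) tables; the values are illustrative only
# and are not taken from the paper.
EVENT_PROBS = {
    "researcher": {"enter_office": 0.6, "use_printer": 0.3, "deliver_mail": 0.1},
    "mailman": {"enter_office": 0.1, "use_printer": 0.1, "deliver_mail": 0.8},
}

# Hypothetical uniform prior over roles.
PRIOR = {"researcher": 0.5, "mailman": 0.5}


def role_posterior(event_counts, event_probs=EVENT_PROBS, prior=PRIOR):
    """Return P(role | events) via Bayes' rule, assuming independent events."""
    log_scores = {}
    for role, probs in event_probs.items():
        # log prior + sum over events of count * log P(event | role)
        log_p = math.log(prior[role])
        for event, count in event_counts.items():
            log_p += count * math.log(probs[event])
        log_scores[role] = log_p

    # Normalize in log space for numerical stability.
    top = max(log_scores.values())
    unnormalized = {r: math.exp(s - top) for r, s in log_scores.items()}
    total = sum(unnormalized.values())
    return {r: v / total for r, v in unnormalized.items()}


# An agent observed delivering mail three times and entering an office once.
posterior = role_posterior({"deliver_mail": 3, "enter_office": 1})
best_role = max(posterior, key=posterior.get)
```

Under these assumed numbers the agent is assigned the "mailman" role, because repeated mail deliveries are far more likely under that role than under "researcher".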

Keywords

Bayes methods; grammars; image sequences; social sciences computing; video signal processing; video surveillance; Bayesian inference; Earley parser algorithm; atomic action; composition rule; grammar model; long timespan video sequence; semantic hotspot; social role; spatial location enumeration; surveillance video; temporal location enumeration; Atomic layer deposition; Grammar; Hidden Markov models; Semantics; Surveillance; Trajectory; Video sequences;

fLanguage

English

Publisher

IEEE

Conference_Titel

Computer Vision Workshops (ICCV Workshops), 2011 IEEE International Conference on

Conference_Location

Barcelona

Print_ISBN

978-1-4673-0062-9

Type

conf

DOI

10.1109/ICCVW.2011.6130422

Filename

6130422