Title :
Recognizing manipulation actions in arts and crafts shows using domain-specific visual and textual cues
Author :
Sapp, Brian ; Chaudhry, Rizwan ; Yu, Xiaodong ; Singh, Gagan ; Perera, Indika ; Ferraro, F. ; Tzoukermann, E. ; Kosecka, Jana ; Neumann, Jorg
Author_Institution :
Univ. of Pennsylvania, Philadelphia, PA, USA
Abstract :
We present an approach for automatic annotation of commercial videos from an arts-and-crafts domain with the aid of textual descriptions. The main focus is on recognizing both manipulation actions (e.g., cut, draw, glue) and the tools that are used to perform these actions (e.g., markers, brushes, glue bottle). We demonstrate how multiple visual cues such as motion descriptors, object presence, and hand poses can be combined with the help of contextual priors that are automatically extracted from associated transcripts or online instructions. Using these diverse features and linguistic information, we propose several increasingly complex computational models for recognizing elementary manipulation actions and composite activities, as well as their temporal order. The approach is evaluated on a novel dataset comprising 27 episodes of PBS Sprout TV, each containing on average 8 manipulation actions.
Keywords :
art; feature extraction; image motion analysis; video retrieval; video signal processing; automatic annotation; commercial video; craft; domain-specific textual cue; domain-specific visual cue; hand pose; linguistic information; manipulation action; motion descriptor; multiple visual cues; object presence; Computational modeling; Educational institutions; Feature extraction; Humans; Internet; USA Councils; Videos
Conference_Title :
Computer Vision Workshops (ICCV Workshops), 2011 IEEE International Conference on
Conference_Location :
Barcelona
Print_ISBN :
978-1-4673-0062-9
DOI :
10.1109/ICCVW.2011.6130435