Title :
Generating natural language description of human behavior from video images
Author :
Kojima, Atsuhiro ; Izumi, Masao ; Tamura, Takeshi ; Fukunaga, Kunio
Author_Institution :
Libr. & Sci. Inf. Center, Osaka Prefecture Univ., Japan
Abstract :
In visual surveillance applications, it is becoming common to analyze video images and interpret them using natural language concepts. We propose an approach to generating a natural language description of human behavior appearing in real video images. First, the head region of a human, used as a proxy for the whole body, is extracted from each frame. Using a model-based method, the three-dimensional pose and position of the head are estimated. Next, the trajectory of these parameters is divided into segments of monotonic motion. For each segment, we evaluate conceptual features such as the degree of change in pose and position and the degree of change in relative distance to objects in the surroundings. By calculating the product of these feature values, the most suitable verb is selected and the other syntactic elements are supplied. Finally, natural language text is generated using a machine translation technique.
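The segmentation and verb-selection steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the feature names (`position_change`, `pose_change`), the `high`/`low` scoring functions, and the three-verb table are all hypothetical, chosen only to show how a trajectory might be split into monotonic runs and how a product of per-feature scores can rank candidate verbs.

```python
from math import prod


def split_monotonic(values, eps=1e-6):
    """Split a 1-D trajectory into runs that keep rising, falling, or staying flat.

    Returns a list of (start_index, end_index, direction), direction in {-1, 0, 1}.
    Adjacent runs share their boundary sample.
    """
    def sign(delta):
        if abs(delta) <= eps:
            return 0
        return 1 if delta > 0 else -1

    segments = []
    start = 0
    current = 0
    for i in range(1, len(values)):
        direction = sign(values[i] - values[i - 1])
        if i == 1:
            current = direction
        elif direction != current:
            # Direction changed: close the run at the previous sample.
            segments.append((start, i - 1, current))
            start = i - 1
            current = direction
    if len(values) > 1:
        segments.append((start, len(values) - 1, current))
    return segments


def high(x):
    """Hypothetical score: near 1 when the feature value is large (clamped to [0, 1])."""
    return max(0.0, min(1.0, x))


def low(x):
    """Hypothetical score: near 1 when the feature value is small."""
    return 1.0 - high(x)


# Hypothetical verb table: each verb lists the score it expects for each feature.
VERB_MODELS = {
    "walk": {"position_change": high, "pose_change": low},
    "turn": {"position_change": low, "pose_change": high},
    "stand": {"position_change": low, "pose_change": low},
}


def select_verb(features, verb_models=VERB_MODELS):
    """Return the verb whose product of feature scores is largest, plus all scores."""
    scores = {
        verb: prod(score(features[name]) for name, score in model.items())
        for verb, model in verb_models.items()
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    print(split_monotonic([0, 1, 2, 2, 2, 1, 0]))
    print(select_verb({"position_change": 0.9, "pose_change": 0.1})[0])
```

Multiplying the scores means a verb is ranked highly only when every one of its features fits, since a single near-zero score suppresses the whole product.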
Keywords :
image motion analysis; language translation; natural languages; surveillance; conceptual features; head region; human behavior; machine translation; model based method; monotonous motions; natural language description; natural language text; pose estimation; position estimation; syntactic elements; verb; video image; visual surveillance; AC generators; Biological system modeling; Humans; Image edge detection; Image segmentation; Layout; Magnetic heads; Natural languages; Surveillance; Vehicles;
Conference_Title :
Pattern Recognition, 2000. Proceedings. 15th International Conference on
Conference_Location :
Barcelona
Print_ISBN :
0-7695-0750-6
DOI :
10.1109/ICPR.2000.903020