DocumentCode
2871634
Title
Generating natural language description of human behavior from video images
Author
Kojima, Atsuhiro ; Izumi, Masao ; Tamura, Takeshi ; Fukunaga, Kunio
Author_Institution
Libr. & Sci. Inf. Center, Osaka Prefecture Univ., Japan
Volume
4
fYear
2000
fDate
2000
Firstpage
728
Abstract
In visual surveillance applications, it is becoming common to analyze video images and interpret them using natural language concepts. We propose an approach to generating a natural language description of human behavior appearing in real video images. First, the head region of a human, as a representative of the whole body, is extracted from each frame. Using a model-based method, the three-dimensional pose and position of the head are estimated. Next, the trajectory of these parameters is divided into segments of monotonous motions. For each segment, we evaluate conceptual features such as the degree of change of pose and position and of the relative distance to objects in the surroundings. By calculating the product of these feature values, the most suitable verb is selected and the other syntactic elements are supplied. Finally, natural language text is generated using a machine translation technique.
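The segmentation and verb-selection steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the feature definitions, the verb set, or the scoring functions, so the feature names, the verb models, and the reading of "monotonous" as "monotonic in one parameter" are all assumptions made for illustration.

```python
from math import prod


def split_monotonic(values, eps=1e-6):
    """Split a 1-D parameter trajectory into runs with a constant direction of change.

    Returns (start, end) index pairs, inclusive; adjacent segments share a boundary sample.
    Assumption: "monotonous motion" is read here as monotonic change of one parameter.
    """
    def sign(delta):
        return 0 if abs(delta) <= eps else (1 if delta > 0 else -1)

    if len(values) < 2:
        return [(0, len(values) - 1)] if values else []

    segments = []
    start = 0
    current = sign(values[1] - values[0])
    for i in range(1, len(values) - 1):
        direction = sign(values[i + 1] - values[i])
        if direction != current:
            segments.append((start, i))
            start = i
            current = direction
    segments.append((start, len(values) - 1))
    return segments


def select_verb(features, verb_models):
    """Pick the verb whose feature product is largest.

    features: hypothetical feature name -> value in [0, 1] for one segment.
    verb_models: hypothetical verb -> {feature name: True if a high value supports the verb}.
    """
    scores = {
        verb: prod(
            features[name] if want_high else 1.0 - features[name]
            for name, want_high in model.items()
        )
        for verb, model in verb_models.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical head-height trajectory: rising, steady, then falling.
segments = split_monotonic([0, 1, 2, 2, 2, 1, 0])

# Hypothetical feature values for one segment and three illustrative verbs.
features = {"position_change": 0.9, "approach_object": 0.8}
verb_models = {
    "stay": {"position_change": False},
    "walk": {"position_change": True, "approach_object": False},
    "approach": {"position_change": True, "approach_object": True},
}
best_verb, verb_scores = select_verb(features, verb_models)
```

Multiplying the feature values means that a single feature close to zero rules a verb out, which is one plausible reason for combining them as a product rather than a sum.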
Keywords
image motion analysis; language translation; natural languages; surveillance; conceptual features; head region; human behavior; machine translation; model based method; monotonous motions; natural language description; natural language text; pose estimation; position estimation; syntactic elements; verb; video image; visual surveillance; AC generators; Biological system modeling; Humans; Image edge detection; Image segmentation; Layout; Magnetic heads; Natural languages; Surveillance; Vehicles
fLanguage
English
Publisher
IEEE
Conference_Titel
Pattern Recognition, 2000. Proceedings. 15th International Conference on
Conference_Location
Barcelona
ISSN
1051-4651
Print_ISBN
0-7695-0750-6
Type
conf
DOI
10.1109/ICPR.2000.903020
Filename
903020