DocumentCode
3413798
Title
Generating coherent natural language annotations for video streams
Author
Khan, Muhammad Usman Ghani; Zhang, Lei; Gotoh, Yusuke
Author_Institution
Univ. of Sheffield, Sheffield, UK
fYear
2012
fDate
Sept. 30 2012-Oct. 3 2012
Firstpage
2893
Lastpage
2896
Abstract
This contribution addresses the generation of natural language annotations for human actions, behaviour and interactions with other objects observed in video streams. The work starts by implementing conventional image processing techniques to extract high-level features from individual frames. A natural language description of the frame contents is then produced from these high-level features. Although the feature extraction processes are error-prone at various levels, we explore approaches for combining their outputs into a coherent description. To extend the approach to video streams, units of features are introduced that incorporate spatial and temporal information to present coherent, smooth and well-phrased descriptions. Evaluation is performed by calculating ROUGE scores between human-annotated and machine-generated descriptions.
Keywords
feature extraction; natural language processing; video signal processing; video streaming; ROUGE scores; human action; human annotated description; image processing; machine generated description; natural language annotation; natural language description; video stream; Feature extraction; Humans; Legged locomotion; Natural languages; Streaming media; Video sequences; Visualization; Natural language description; Video annotation; Video processing; video feature units
fLanguage
English
Publisher
IEEE
Conference_Titel
Image Processing (ICIP), 2012 19th IEEE International Conference on
Conference_Location
Orlando, FL
ISSN
1522-4880
Print_ISBN
978-1-4673-2534-9
Electronic_ISBN
1522-4880
Type
conf
DOI
10.1109/ICIP.2012.6467504
Filename
6467504