• DocumentCode
    2082442
  • Title
    ViTex: Video To Tex and Its Application in Aerial Video Surveillance
  • Author
    Cheng, Hui ; Butler, Darren ; Basu, Chumki
  • Author_Institution
    Sarnoff Corporation, 201 Washington Rd, Princeton, NJ
  • Volume
    1
  • fYear
    2006
  • fDate
    17-22 June 2006
  • Firstpage
    586
  • Lastpage
    593
  • Abstract
    Given the huge amount of aerial surveillance video captured daily, an automated video understanding system is needed to extract information and to generate metadata that is easy to search, browse and summarize, and that can be readily understood by an end user. In this paper, we propose a Video-To-Text engine called ViTex that automatically generates text descriptions of the content of a video. The ViTex engine first segments an input video sequence according to pre-defined semantic classes using a Mixture-of-Expert blob segmentation algorithm. The resulting segmentation is converted into a semantic concept graph based on domain knowledge and a semantic concept hierarchy. Then, the initial semantic concept graph is summarized and pruned. Finally, according to the summarized semantic concept graph and its changes over time, text descriptions are automatically generated using one of three description schemes: key-frame, key-object and key-change descriptions. We have applied the ViTex engine to aerial surveillance video and compared its performance with ground-truth text descriptions generated by humans.
  • Keywords
    Data mining; Engines; Humans; Layout; National security; Natural languages; Research and development; Video sequences; Video surveillance; Videoconference;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on
  • ISSN
    1063-6919
  • Print_ISBN
    0-7695-2597-0
  • Type
    conf
  • DOI
    10.1109/CVPR.2006.332
  • Filename
    1640808