• DocumentCode
    111779
  • Title

    Enhancing Video Event Recognition Using Automatically Constructed Semantic-Visual Knowledge Base

  • Author

    Xishan Zhang ; Yang Yang ; Yongdong Zhang ; Huanbo Luan ; Jintao Li ; Hanwang Zhang ; Tat-Seng Chua

  • Author_Institution
    Key Lab. of Intell. Inf. Process., Inst. of Comput. Technol., Beijing, China
  • Volume
    17
  • Issue
    9
  • fYear
    2015
  • fDate
    Sept. 2015
  • Firstpage
    1562
  • Lastpage
    1575
  • Abstract
    The task of recognizing events from video has attracted a lot of attention in recent years. However, due to the complex nature of user-defined events, the use of purely audio-visual content analysis without domain knowledge has been found to be grossly inadequate. In this paper, we propose to construct a semantic-visual knowledge base to encode the rich event-centric concepts and their relationships from the well-established lexical databases, including FrameNet, as well as the concept-specific visual knowledge from ImageNet. Based on this semantic-visual knowledge base, we design an effective system for video event recognition. Specifically, in order to narrow the semantic gap between the high-level complex events and low-level visual representations, we utilize the event-centric semantic concepts encoded in the knowledge base as the intermediate-level event representation, which offers both human-perceivable and machine-interpretable semantic clues for event recognition. In addition, in order to leverage the abundant ImageNet images, we propose a robust transfer learning model to learn the noise-resistant concept classifiers for videos. Extensive experiments on various real-world video datasets demonstrate the superiority of our proposed system as compared to the state-of-the-art approaches.
  • Keywords
    image classification; knowledge based systems; learning (artificial intelligence); video signal processing; FrameNet; ImageNet images; audio-visual content analysis; automatically constructed semantic-visual knowledge base; concept-specific visual knowledge; event-centric semantic concept encoding; high-level complex events; human-perceivable semantic clues; intermediate-level event representation; lexical database; low-level visual representation; machine-interpretable semantic clues; multiple kernel learning algorithm; noise-resistant concept classifier; robust transfer learning model; semantic gap; user-defined events; video event recognition; Feature extraction; Knowledge based systems; Multimedia communication; Semantics; Streaming media; Vehicles; Visualization; Concept detection; event recognition; knowledge base
  • fLanguage
    English
  • Journal_Title
    Multimedia, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1520-9210
  • Type

    jour

  • DOI
    10.1109/TMM.2015.2449660
  • Filename
    7132742