• DocumentCode
    3401011
  • Title
    CNN-based shot boundary detection and video annotation
  • Author
    Wenjing Tong; Li Song; Xiaokang Yang; Hui Qu; Rong Xie
  • Author_Institution
    Inst. of Image Commun. & Network Eng., Shanghai Jiao Tong Univ., Shanghai, China
  • fYear
    2015
  • fDate
    17-19 June 2015
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    With the explosive growth of video data, content-based video analysis and management technologies such as indexing, browsing, and retrieval have drawn much attention. Video shot boundary detection (SBD) is usually the first, and an important, step for these technologies. Great effort has gone into improving the accuracy of SBD algorithms; however, most existing work relies on signal-level features rather than interpretable features of frames. In this paper, we propose a novel video shot boundary detection framework based on interpretable TAGs learned by Convolutional Neural Networks (CNNs). First, we apply candidate segment selection to predict the positions of shot boundaries and discard most non-boundary frames; this preprocessing step helps improve both the accuracy and the speed of the SBD algorithm. Then, cut transition and gradual transition detection, both based on the interpretable TAGs, are performed to identify the shot boundaries within the candidate segments. Afterwards, we combine the features of the frames in each shot to obtain semantic labels for that shot. Experiments on TRECVID 2001 test data show that the proposed scheme achieves better performance than state-of-the-art schemes. In addition, the semantic labels obtained by the framework can be used to describe the content of a shot.
  • Keywords
    neural nets; video coding; CNN; TRECVID 2001 test data; content-based video analysis; convolutional neural networks; management technologies; semantic labels; video annotation; video data; video shot boundary detection; Accuracy; Computed tomography; Feature extraction; Indexing; Mathematical model; Neural networks; Semantics; Retrieval and indexing; deep learning; shot boundary detection; video coding and processing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    2015 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)
  • Conference_Location
    Ghent
  • Type
    conf
  • DOI
    10.1109/BMSB.2015.7177222
  • Filename
    7177222