• DocumentCode
    3672398
  • Title

    A Dataset for Movie Description

  • Author

    Anna Rohrbach;Marcus Rohrbach;Niket Tandon;Bernt Schiele

  • Author_Institution
    Max Planck Institute for Informatics, Saarbrü
  • fYear
    2015
  • fDate
    6/1/2015 12:00:00 AM
  • Firstpage
    3202
  • Lastpage
    3212
  • Abstract
    Audio Description (AD) provides linguistic descriptions of movies and allows visually impaired people to follow a movie along with their peers. Such descriptions are by design mainly visual and thus naturally form an interesting data source for computer vision and computational linguistics. In this work, we propose a novel dataset of transcribed ADs that are temporally aligned to full-length HD movies. In addition, we collected the aligned movie scripts, which have been used in prior work, and compare the two sources of descriptions. In total, the MPII Movie Description dataset (MPII-MD) contains a parallel corpus of over 68K sentences and video snippets from 94 HD movies. We characterize the dataset by benchmarking different approaches for generating video descriptions. Comparing ADs to scripts, we find that ADs are far more visual and describe precisely what is shown rather than what should happen according to the scripts created prior to movie production.
  • Keywords
    "Motion pictures","Visualization","Semantics","Feature extraction","High definition video","Production","Adaptation models"
  • Publisher
    IEEE
  • Conference_Titel
    Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on
  • Electronic_ISBN
    1063-6919
  • Type

    conf

  • DOI
    10.1109/CVPR.2015.7298940
  • Filename
    7298940