Title of article :
Video Prediction Using Multi-Scale Deep Neural Networks
Author/Authors :
Shayanfar, Nima (Computer Engineering Department, Yazd University); Derhami, Vali (Computer Engineering Department, Yazd University); Rezaeian, Mehdi (Computer Engineering Department, Yazd University)
From page :
423
To page :
431
Abstract :
Video prediction aims to predict the next frame of a video from a sequence of input frames. Although numerous studies have tackled frame prediction, satisfactory performance has not yet been achieved, so it remains an open problem. This work studies multi-scale processing for video prediction and presents a new network architecture for it. The architecture belongs to the broad family of autoencoders and comprises an encoder and a decoder. A pretrained VGG network serves as the encoder and processes a pyramid of input frames at multiple scales simultaneously. The decoder is based on 3D convolutional neurons. The architecture is evaluated on three datasets of varying difficulty and compared with two conventional autoencoders. The results indicate that combining the pretrained network with multi-scale processing yields a well-performing approach.
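The abstract does not state how the pyramid of input frames is built. As a rough illustration only, the sketch below assumes each coarser scale is produced by 2x2 average pooling of the previous one; the names `downsample_2x` and `build_pyramid` and the grayscale list-of-rows frame representation are hypothetical and not taken from the paper.

```python
from typing import List

# Assumption: a frame is a grayscale image stored as rows of pixel intensities.
Frame = List[List[float]]


def downsample_2x(frame: Frame) -> Frame:
    """Halve height and width by averaging each non-overlapping 2x2 block."""
    # Ignore a trailing odd row or column so every block is complete.
    h = len(frame) // 2 * 2
    w = len(frame[0]) // 2 * 2
    return [
        [
            (frame[r][c] + frame[r][c + 1]
             + frame[r + 1][c] + frame[r + 1][c + 1]) / 4.0
            for c in range(0, w, 2)
        ]
        for r in range(0, h, 2)
    ]


def build_pyramid(frame: Frame, levels: int) -> List[Frame]:
    """Return up to `levels` scales, ordered from full resolution to coarsest."""
    pyramid = [frame]
    current = frame
    for _ in range(levels - 1):
        # Stop early once the frame is too small to pool again.
        if len(current) < 2 or len(current[0]) < 2:
            break
        current = downsample_2x(current)
        pyramid.append(current)
    return pyramid


# Usage: one pyramid per input frame of the sequence.
frame = [[float(4 * r + c) for c in range(4)] for r in range(4)]
pyramid = build_pyramid(frame, levels=3)
for level in pyramid:
    print(len(level), "x", len(level[0]))
```

In a multi-scale encoder of the kind the abstract describes, each level of such a pyramid would be passed to the encoder for every frame in the input sequence; how the resulting features are combined is not specified in the abstract.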
Keywords :
Deep learning, Convolutional autoencoder, Video prediction, Multiscale processing
Journal title :
Journal of Artificial Intelligence and Data Mining
Record number :
2733672