Title :
Learning convolutional neural networks from few samples
Author :
Wagner, René ; Thom, Markus ; Schweiger, Roland ; Palm, Günther ; Rothermel, Albrecht
Author_Institution :
Inst. of Microelectron., Univ. of Ulm, Ulm, Germany
Abstract :
Learning convolutional neural networks (CNNs) is commonly carried out by plain supervised gradient descent. With sufficient training data, this leads to very competitive results on visual recognition tasks even when starting from a random initialization. When the amount of labeled data is limited, however, CNNs reveal their strong dependence on large training sets. Recent results have shown that a well-chosen optimization starting point can be beneficial for convergence to a well-generalizing minimum. Such a starting point has mostly been found using unsupervised feature learning techniques such as sparse coding, or by transfer learning from related recognition tasks. In this work, we compare these two approaches against a simple patch-based initialization scheme and against a random initialization of the weights. We show that pre-training helps to train CNNs from few samples and that the correct choice of the initialization scheme can improve the network's performance by up to 41% compared to random initialization.
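
Note: The abstract contrasts a random starting point with a patch-based one. The following is a minimal sketch of one plausible patch-based variant, written in Python with NumPy; the function names, the zero-mean/unit-norm normalization, and all hyper-parameters are illustrative assumptions, not the authors' exact procedure.

import numpy as np

def random_init(num_kernels, kernel_size, rng):
    # Baseline: draw kernel weights from a small zero-mean Gaussian.
    return rng.normal(0.0, 0.01, size=(num_kernels, kernel_size, kernel_size))

def patch_based_init(images, num_kernels, kernel_size, rng):
    # Initialize first-layer kernels from randomly sampled image patches,
    # normalized to zero mean and unit norm (one plausible variant).
    kernels = np.empty((num_kernels, kernel_size, kernel_size))
    for k in range(num_kernels):
        img = images[rng.integers(len(images))]
        y = rng.integers(img.shape[0] - kernel_size + 1)
        x = rng.integers(img.shape[1] - kernel_size + 1)
        patch = img[y:y + kernel_size, x:x + kernel_size].astype(np.float64)
        patch -= patch.mean()                      # zero mean
        norm = np.linalg.norm(patch)
        kernels[k] = patch / norm if norm > 0 else patch
    return kernels

rng = np.random.default_rng(0)
images = rng.random((100, 32, 32))                 # stand-in for real training images
w_rand = random_init(16, 5, rng)                   # random starting point
w_patch = patch_based_init(images, 16, 5, rng)     # patch-based starting point

Either weight set would serve as the initialization of a CNN's first convolutional layer before supervised gradient descent; the paper's point is that the patch-based (or otherwise pre-trained) starting point matters most when labeled samples are few.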
Keywords :
convolution; gradient methods; image coding; image recognition; neural net architecture; optimisation; unsupervised learning; CNN; convolutional neural network architecture; labeled data; network performance; optimization; patch based initialization scheme; random initialization; sparse coding; supervised gradient descent; training data; transfer learning; unsupervised feature learning techniques; visual recognition tasks; Biological neural networks; Feature extraction; Kernel; Principal component analysis; Training; Training data; Visualization;
Conference_Title :
The 2013 International Joint Conference on Neural Networks (IJCNN)
Conference_Location :
Dallas, TX, USA
Print_ISBN :
978-1-4673-6128-6
DOI :
10.1109/IJCNN.2013.6706969