A new active labeling method for deep learning

Author

Dan Wang ; Yi Shang

Author_Institution

Dept. of Comput. Sci., Univ. of Missouri, Columbia, MO, USA

fYear

2014

fDate

6-11 July 2014

Firstpage

112

Lastpage

119

Abstract

Deep learning has been shown to achieve outstanding performance in a number of challenging real-world applications. However, most existing work assumes a fixed set of labeled data, which does not necessarily hold in real-world applications, where obtaining labeled data is usually expensive and time-consuming. Active labeling in deep learning aims to achieve the best learning result with a limited labeled data set by choosing the most appropriate unlabeled data to be labeled. This paper presents a new active labeling method, AL-DL, for cost-effective selection of data to be labeled. AL-DL uses one of three metrics for data selection: least confidence, margin sampling, and entropy. The method is applied to deep learning networks based on stacked restricted Boltzmann machines, as well as stacked autoencoders. In experiments on the MNIST benchmark dataset, the method consistently outperforms random labeling by a significant margin.
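
The three selection metrics named in the abstract can be illustrated with a minimal sketch. It uses the standard uncertainty-sampling definitions of least confidence, margin, and entropy over a classifier's predicted class probabilities; the abstract does not give AL-DL's exact formulation, so the function names (`least_confidence`, `margin`, `entropy`, `select_for_labeling`), the ranking step, and the example probabilities are assumptions for illustration, not the authors' implementation.

```python
import math


def least_confidence(probs):
    """Uncertainty as 1 minus the probability of the most likely class."""
    return 1.0 - max(probs)


def margin(probs):
    """Gap between the two most likely classes; a small gap means uncertain."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second


def entropy(probs):
    """Shannon entropy (in nats) of the predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_labeling(pool_probs, k, metric="entropy"):
    """Return indices of the k pool samples the classifier is least sure about.

    pool_probs is a list of per-sample class-probability lists (hypothetical
    input format, e.g. softmax outputs of a trained network).
    """
    if metric == "least_confidence":
        scores = [least_confidence(p) for p in pool_probs]
    elif metric == "margin":
        # Negate so that a higher score always means more uncertain.
        scores = [-margin(p) for p in pool_probs]
    elif metric == "entropy":
        scores = [entropy(p) for p in pool_probs]
    else:
        raise ValueError(f"unknown metric: {metric}")
    ranked = sorted(range(len(pool_probs)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Made-up predictions for three unlabeled samples over three classes.
pool = [
    [0.90, 0.05, 0.05],  # confident prediction
    [0.40, 0.35, 0.25],  # spread across all classes
    [0.50, 0.49, 0.01],  # near tie between two classes
]
```

With these made-up probabilities the metrics can disagree: least confidence and entropy rank the second sample as most uncertain, while margin sampling picks the third, whose top two classes are nearly tied. In an active labeling loop, the selected samples would be sent for labeling and added to the training set before the network is retrained.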

Keywords

Boltzmann machines; learning (artificial intelligence); AL-DL; MNIST benchmark dataset; active labeling method; data selection; deep learning networks; least confidence; margin sampling; stacked autoencoders; stacked restricted Boltzmann machines; Classification algorithms; Entropy; Labeling; Measurement; Neural networks; Training; Uncertainty;

fLanguage

English

Publisher

IEEE

Conference_Titel

Neural Networks (IJCNN), 2014 International Joint Conference on

Conference_Location

Beijing

Print_ISBN

978-1-4799-6627-1

Type

conf

DOI

10.1109/IJCNN.2014.6889457

Filename

6889457