Rényi Divergence Based Generalization for Learning of Classification Restricted Boltzmann Machines

Author

Qian Yu ; Yuexian Hou ; Xiaozhao Zhao ; Guochen Cheng

Author_Institution

Sch. of Comput. Software, Tianjin Univ., Tianjin, China

fYear

2014

fDate

14 Dec. 2014

Firstpage

692

Lastpage

697

Abstract

As a derivative of the Restricted Boltzmann Machine (RBM), the classification RBM (Class RBM) has been shown to be an effective classifier with a probabilistic interpretation. Several elegant learning methods and models related to Class RBM have been proposed. This paper proposes and analyzes a Rényi divergence based generalization of the discriminative learning objective of Class RBM. Specifically, we extend the Conditional Log Likelihood (CLL) objective to a general learning criterion. We prove that some existing popular training methods can be derived from this generalization by setting its parameters to specific values. Intuitively, the regularization under different parameter settings constrains the learned RBM distribution in different ways, and the parameter setting that provides a suitable distribution constraint for a particular sample set leads to the best performance. Moreover, we show that this generalized criterion actually extends the CLL objective with a Rényi divergence-based regularization. The uniform distribution used in this divergence-based regularization can be replaced by a sample-based distribution. This modification is applicable to any specific case of the general objective, and we call the appended loss the general margin. The proposed generalization enables an effective model selection procedure, and experiments on human face recognition and document classification show significant performance improvements over existing learning methods. It is also shown empirically that the general margin loss is able to stabilize parameter sensitivity and further improve classifier performance.
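
To make the idea of a Rényi divergence-based regularizer concrete, the following is a minimal sketch and not the paper's actual objective. It uses the standard discrete Rényi divergence, D_alpha(p || q) = log(sum_i p_i^alpha * q_i^(1 - alpha)) / (alpha - 1), which tends to the Kullback-Leibler divergence as alpha approaches 1. The function `regularized_nll`, its weight `lam`, and the sign and direction of the divergence term are hypothetical choices made for illustration only; the abstract does not specify them.

```python
import math


def renyi_divergence(p, q, alpha):
    """Standard Renyi divergence D_alpha(p || q) for discrete distributions.

    Assumes alpha > 0 and q_i > 0 wherever p_i > 0.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if abs(alpha - 1.0) < 1e-12:
        # The limit alpha -> 1 is the Kullback-Leibler divergence.
        return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    s = sum((pi ** alpha) * (qi ** (1.0 - alpha)) for pi, qi in zip(p, q) if pi > 0)
    return math.log(s) / (alpha - 1.0)


def regularized_nll(class_probs, label, alpha, lam):
    """Hypothetical objective: negative conditional log likelihood plus a
    weighted Renyi divergence between the predicted class distribution and
    the uniform distribution. The weight `lam` and the sign of the term are
    illustrative assumptions, not taken from the paper.
    """
    k = len(class_probs)
    uniform = [1.0 / k] * k
    nll = -math.log(class_probs[label])
    return nll + lam * renyi_divergence(class_probs, uniform, alpha)


if __name__ == "__main__":
    probs = [0.7, 0.2, 0.1]  # example predicted p(y | x) over three classes
    for a in (0.5, 1.0, 2.0):
        print(a, renyi_divergence(probs, [1 / 3] * 3, a), regularized_nll(probs, 0, a, 0.1))
```

In this sketch the divergence is zero when the predicted distribution is uniform and grows as the prediction becomes more peaked, so varying `alpha` and `lam` changes how strongly the predicted distribution is constrained. Replacing `uniform` with an empirical class distribution would correspond loosely to the sample-based variant mentioned in the abstract.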

Keywords

Boltzmann machines; document handling; face recognition; generalisation (artificial intelligence); learning (artificial intelligence); pattern classification; statistical distributions; CLL objective; Class RBM; Rényi divergence based generalization; Rényi divergence-based regularization; classification RBM; classification restricted Boltzmann machines; conditional log likelihood objective; discriminative learning objective; distribution constraint; document classification; general learning criterion; human face recognition; learning method; learning model; parameter sensitivity; probabilistic interpretation; sample-based distribution; training method; Databases; Educational institutions; Error analysis; Face recognition; Learning systems; Linear programming; Training; Classification RBM; Discriminative Learning; Rényi Divergence;

fLanguage

English

Publisher

IEEE

Conference_Titel

Data Mining Workshop (ICDMW), 2014 IEEE International Conference on

Conference_Location

Shenzhen

Print_ISBN

978-1-4799-4275-6

Type

conf

DOI

10.1109/ICDMW.2014.17

Filename

7022663