DocumentCode
293490
Title
Learning by α-divergence
Author
Kamimura, Ryotaro; Nakanishi, Shohachiro
Author_Institution
Inf. Sci. Lab., Tokai Univ., Kanagawa, Japan
Volume
3
fYear
1995
fDate
20-24 Mar 1995
Firstpage
1535
Abstract
In the present paper, we propose a new cost function, called the α-divergence, which generalizes the relative entropy, or Kullback's divergence measure, for neural networks. The fundamental characteristics of the α-divergence are summarized in the following three points: 1) by varying the parameter α, multiple cost functions can be obtained for different purposes or problems; 2) the error signal of the α-divergence is directly proportional to the difference between targets and outputs, because it eliminates the derivative of the sigmoidal function; and 3) the α-divergence has the effect of eliminating saturated units. We formulated an update rule to minimize the α-divergence and applied the method to the acquisition of grammatical competence. Experimental results confirmed a marked improvement in generalization with the α-divergence. This improvement is due to the property that the derivative of the α-divergence remains effective especially at eliminating saturated units.
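Note (illustrative sketch): the abstract states the properties of the cost function but not its formulas. Below is a minimal NumPy sketch, assuming the Amari-style parameterization D_α(t, y) = (4 / (1 - α²)) · Σ [1 - t^((1+α)/2) · y^((1-α)/2) - (1-t)^((1+α)/2) · (1-y)^((1-α)/2)] per sigmoid unit, under which α = 1 recovers the relative entropy KL(t || y). The function names, the exact parameterization, and the sample α value are assumptions for illustration, not the authors' definitions.

    import numpy as np

    def alpha_divergence(t, y, alpha, eps=1e-12):
        """Assumed Amari-style alpha-divergence between target t and output y,
        each treated as a Bernoulli parameter in (0, 1).
        Valid for alpha != -1; alpha = 1 is handled as the KL limit."""
        t = np.clip(t, eps, 1.0 - eps)
        y = np.clip(y, eps, 1.0 - eps)
        if np.isclose(alpha, 1.0):  # relative-entropy (Kullback) limit
            return np.sum(t * np.log(t / y) + (1 - t) * np.log((1 - t) / (1 - y)))
        a, b = (1 + alpha) / 2, (1 - alpha) / 2
        mix = t**a * y**b + (1 - t)**a * (1 - y)**b
        return np.sum((4.0 / (1.0 - alpha**2)) * (1.0 - mix))

    def output_delta(t, y, alpha, eps=1e-12):
        """Error signal dD/d(net) for a sigmoid output unit y = sigmoid(net).
        The divergence gradient grows as y saturates, offsetting the vanishing
        sigmoid derivative y*(1 - y), so saturated units still receive a usable
        signal; at alpha = 1 this is exactly the cross-entropy delta, y - t."""
        t = np.clip(t, eps, 1.0 - eps)
        y = np.clip(y, eps, 1.0 - eps)
        a = (1 + alpha) / 2
        dD_dy = (2.0 / (1.0 + alpha)) * (((1 - t) / (1 - y))**a - (t / y)**a)
        return dD_dy * y * (1 - y)

    # Toy usage: one delta-rule step for a single sigmoid output unit.
    x = np.array([0.5, -1.0, 2.0])                # input pattern
    w = np.zeros(3)                               # weights
    t = np.array([1.0])                           # target
    y = 1.0 / (1.0 + np.exp(-(w @ x)))            # sigmoid output
    w -= 0.1 * output_delta(t, y, alpha=0.5) * x  # learning rate 0.1

Because the sigmoid derivative is absorbed into the error signal, the gradient does not vanish for saturated units, which matches the generalization effect the abstract attributes to the α-divergence.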
Keywords
convergence of numerical methods; entropy; generalisation (artificial intelligence); grammars; learning (artificial intelligence); neural nets; α-divergence learning; Kullback's divergence measure; cost function; generalization; grammatical competence; neural network; relative entropy; saturated units; sigmoidal function; Convergence; Cost function; Entropy; Information science; Laboratories; Neural networks; Supervised learning
fLanguage
English
Publisher
ieee
Conference_Title
Fuzzy Systems, 1995. International Joint Conference of the Fourth IEEE International Conference on Fuzzy Systems and The Second International Fuzzy Engineering Symposium, Proceedings of 1995 IEEE International Conference on Fuzzy Systems
Conference_Location
Yokohama, Japan
Print_ISBN
0-7803-2461-7
Type
conf
DOI
10.1109/FUZZY.1995.409882
Filename
409882