DocumentCode
43048
Title
On the Complexity of Neural Network Classifiers: A Comparison Between Shallow and Deep Architectures
Author
Bianchini, Monica; Scarselli, Franco
Author_Institution
Univ. of Siena, Siena, Italy
Volume
25
Issue
8
fYear
2014
fDate
Aug. 2014
Firstpage
1553
Lastpage
1565
Abstract
Recently, researchers in the artificial neural network field have focused their attention on connectionist models composed of several hidden layers. In fact, experimental results and heuristic considerations suggest that deep architectures are more suitable than shallow ones for modern applications that face very complex problems, e.g., vision and human language understanding. However, the theoretical results actually supporting such a claim are still few and incomplete. In this paper, we propose a new approach to studying how the depth of feedforward neural networks affects their ability to implement high-complexity functions. First, a new measure based on topological concepts is introduced, aimed at evaluating the complexity of the function implemented by a neural network used for classification purposes. Then, deep and shallow neural architectures with common sigmoidal activation functions are compared, by deriving upper and lower bounds on their complexity, and by studying how the complexity depends on the number of hidden units and on the activation function used. The obtained results seem to support the idea that deep networks actually implement functions of higher complexity, so that they are able, with the same amount of resources, to address more difficult problems.
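As a sketch of the measure the abstract describes (notation assumed here for illustration, not taken verbatim from the paper): for a network N computing f_N : \mathbb{R}^n \to \mathbb{R}, with the classification rule that x belongs to the positive class iff f_N(x) \ge 0, the decision region is S_N = \{ x \in \mathbb{R}^n : f_N(x) \ge 0 \}, and its topological complexity is measured by the sum of the Betti numbers of S_N,

    B(S_N) = \sum_{i=0}^{n-1} b_i(S_N),

where b_0(S_N) counts the connected components of S_N and each b_i(S_N), i \ge 1, counts its i-dimensional holes. The upper and lower bounds mentioned above then compare how B(S_N) can grow with the number of hidden units and layers in shallow versus deep architectures.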
Keywords
computational complexity; feedforward neural nets; pattern classification; topology; artificial neural network; classification; deep architecture; deep network; feedforward neural network; function complexity evaluation; hidden layers; high complexity functions; human language understanding; neural network classifiers; shallow architecture; sigmoidal activation function; topological concepts; vision; Biological neural networks; Complexity theory; Computer architecture; Neurons; Polynomials; Upper bound; Betti numbers; Vapnik–Chervonenkis dimension (VC-dim); deep neural networks; function approximation; topological complexity
fLanguage
English
Journal_Title
IEEE Transactions on Neural Networks and Learning Systems
Publisher
IEEE
ISSN
2162-237X
Type
jour
DOI
10.1109/TNNLS.2013.2293637
Filename
6697897
Link To Document