Regional Information Center for Science and Technology

DocumentCode :

1763614

Title :

Tensor Deep Stacking Networks

Author :

Hutchinson, Brian ; Li Deng ; Dong Yu

Author_Institution :

Dept. of Electr. Eng., Univ. of Washington, Seattle, WA, USA

Volume :

Issue :

fYear :

2013

fDate :

Aug. 2013

Firstpage :

1944

Lastpage :

1957

Abstract :

A novel deep architecture, the tensor deep stacking network (T-DSN), is presented. The T-DSN consists of multiple, stacked blocks, where each block contains a bilinear mapping from two hidden layers to the output layer, using a weight tensor to incorporate higher order statistics of the hidden binary ([0,1]) features. A learning algorithm for the T-DSN's weight matrices and tensors is developed and described in which the main parameter estimation burden is shifted to a convex subproblem with a closed-form solution. Using an efficient and scalable parallel implementation for CPU clusters, we train sets of T-DSNs in three popular tasks in increasing order of the data size: handwritten digit recognition using MNIST (60k), isolated state/phone classification and continuous phone recognition using TIMIT (1.1M), and isolated phone classification using WSJ0 (5.2M). Experimental results in all three tasks demonstrate the effectiveness of the T-DSN and the associated learning methods in a consistent manner. In particular, a sufficient depth of the T-DSN, a symmetry in the two-hidden-layer structure in each T-DSN block, our model parameter learning algorithm, and a softmax layer on top of the T-DSN are shown to have all contributed to the low error rates observed in the experiments for all three tasks.
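
Illustrative sketch (not part of the original record): the abstract describes a block in which two hidden layers are combined through a weight tensor, with the tensor obtained from a convex subproblem that has a closed-form solution. The NumPy code below is one minimal way to picture that idea, not the authors' implementation. The sigmoid activation, the ridge term, and all names (`hidden_products`, `fit_tensor`, `predict`, `W1`, `W2`, `ridge`) are assumptions made for the example. The lower-layer weights are simply treated as given, and the stacking of multiple blocks is not shown.

```python
import numpy as np


def sigmoid(z):
    """Elementwise logistic function; the activation choice is an assumption."""
    return 1.0 / (1.0 + np.exp(-z))


def hidden_products(X, W1, W2):
    """Flattened outer products of the two hidden layers, one row per sample.

    X: (n, d) inputs, W1: (d, p), W2: (d, q). Returns an (n, p*q) matrix.
    """
    H1 = sigmoid(X @ W1)  # (n, p), values in (0, 1)
    H2 = sigmoid(X @ W2)  # (n, q), values in (0, 1)
    return np.einsum("ni,nj->nij", H1, H2).reshape(X.shape[0], -1)


def fit_tensor(X, Y, W1, W2, ridge=1e-3):
    """Closed-form ridge least-squares fit of the weight tensor.

    With W1 and W2 held fixed, the outputs are linear in the tensor, so the
    squared-error problem is convex. The ridge term is an assumption added
    here for numerical stability.
    """
    H = hidden_products(X, W1, W2)                 # (n, p*q)
    A = H.T @ H + ridge * np.eye(H.shape[1])       # regularized Gram matrix
    U = np.linalg.solve(A, H.T @ Y)                # (p*q, c)
    return U.reshape(W1.shape[1], W2.shape[1], Y.shape[1])


def predict(X, W1, W2, T):
    """Bilinear map: y[n, k] = sum_ij H1[n, i] * H2[n, j] * T[i, j, k]."""
    H1 = sigmoid(X @ W1)
    H2 = sigmoid(X @ W2)
    return np.einsum("ni,nj,ijk->nk", H1, H2, T)
```

Flattening the outer product of the two hidden layers turns the bilinear map into an ordinary linear map, which is why a standard least-squares solve suffices once the lower-layer weights are fixed.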

Keywords :

convex programming; handwritten character recognition; higher order statistics; image classification; learning (artificial intelligence); matrix algebra; parameter estimation; pattern classification; tensors; CPU clusters; T-DSN weight matrices; bilinear mapping; closed-form solution; convex subproblem; hidden binary features; higher-order statistics; parallel implementation; parameter estimation; tensor deep stacking networks; weight tensor; Closed-form solutions; Computer architecture; Machine learning; Stacking; Tensile stress; Training; Vectors; Deep learning; MNIST; TIMIT; WSJ; bilinear models; handwriting image classification; phone classification and recognition; stacking networks; tensor;

fLanguage :

English

Journal_Title :

Pattern Analysis and Machine Intelligence, IEEE Transactions on

Publisher :

IEEE

ISSN :

0162-8828

Type :

jour

DOI :

10.1109/TPAMI.2012.268

Filename :

6389679

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1763614