DocumentCode :
672364
Title :
Semi-supervised training of Deep Neural Networks
Author :
Veselý, Karel ; Hannemann, Mirko ; Burget, Lukáš
Author_Institution :
Speech@FIT & IT4I Center of Excellence, Brno Univ. of Technol., Brno, Czech Republic
fYear :
2013
fDate :
8-12 Dec. 2013
Firstpage :
267
Lastpage :
272
Abstract :
In this paper we search for an optimal strategy for semi-supervised Deep Neural Network (DNN) training. We assume that a small part of the data is transcribed, while the majority is untranscribed. We explore self-training strategies with data selection based on both utterance-level and frame-level confidences. Furthermore, we study the interaction between semi-supervised frame-discriminative training and sequence-discriminative sMBR training. We found it beneficial to reduce the disproportion between the amounts of transcribed and untranscribed data by including the transcribed data several times, as well as to perform frame selection based on per-frame confidences derived from confusion in a lattice. For the experiments, we used the Limited language pack condition of the Surprise language task (Vietnamese) from the IARPA Babel program. The absolute Word Error Rate (WER) improvement for frame cross-entropy training is 2.2%, which corresponds to a WER recovery of 36% compared to an identical system where the DNN is built on the fully transcribed data.
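Illustration :
The abstract describes self-training with per-frame confidence selection and replication of the transcribed pool. Below is a minimal Python sketch of that data-selection step only, not the authors' actual pipeline: the function names, the 0.7 confidence threshold, and the 3x replication factor are illustrative assumptions, and the per-frame confidences are taken as a given input array (in the paper they are derived from confusion in a decoding lattice produced by a seed system).

    import numpy as np

    def select_confident_frames(frames, labels, confidences, threshold=0.7):
        """Keep only untranscribed frames whose per-frame confidence
        (e.g. a lattice-derived posterior) exceeds the threshold."""
        mask = confidences >= threshold
        return frames[mask], labels[mask]

    def build_semi_supervised_set(sup_frames, sup_labels,
                                  unsup_frames, hyp_labels, confidences,
                                  threshold=0.7, sup_replicas=3):
        """Combine transcribed and confidence-filtered untranscribed frames.
        The transcribed data is replicated sup_replicas times to reduce the
        disproportion between the two pools (factor is a tunable assumption)."""
        sel_frames, sel_labels = select_confident_frames(
            unsup_frames, hyp_labels, confidences, threshold)
        # Replicate the (small) transcribed set, then append the selected
        # untranscribed frames with their hypothesized labels.
        frames = np.concatenate([np.tile(sup_frames, (sup_replicas, 1)),
                                 sel_frames])
        labels = np.concatenate([np.tile(sup_labels, sup_replicas),
                                 sel_labels])
        return frames, labels

The resulting frame pool would then feed an ordinary frame cross-entropy DNN training loop; sequence-discriminative sMBR training, studied in the paper, operates on lattices rather than on such a frame pool and is not sketched here.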
Keywords :
entropy; learning (artificial intelligence); natural language processing; neural nets; DNN training; IARPA Babel program; Surprise language task; Vietnamese language; WER recovery; absolute WER improvement; absolute word error rate improvement; data selection; disproportion reduction; frame cross-entropy training; frame-level confidence; limited language pack condition; optimal strategy; semisupervised deep-neural network training; semisupervised frame-discriminative training; sequence-discriminative sMBR training; transcribed data; untranscribed data; utterance-level confidence; Acoustics; Data models; Lattices; Maximum likelihood decoding; Speech; Training; Training data; Babel program; DNN; deep network; self-training; semi-supervised training;
fLanguage :
English
Publisher :
ieee
Conference_Title :
Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on
Conference_Location :
Olomouc
Type :
conf
DOI :
10.1109/ASRU.2013.6707741
Filename :
6707741