Regional Information Center for Science and Technology - Unsupervised speaker adaptation of DNN-HMM by selecting similar speakers for lecture transcription

DocumentCode :

117977

Title :

Unsupervised speaker adaptation of DNN-HMM by selecting similar speakers for lecture transcription

Author :

Mimura, Masato ; Kawahara, Tatsuya

Author_Institution :

Acad. Center for Comput. & Media Studies, Kyoto Univ., Kyoto, Japan

fYear :

2014

fDate :

9-12 Dec. 2014

Firstpage :

Lastpage :

Abstract :

Unsupervised speaker adaptation of a Deep Neural Network (DNN) is investigated for lecture transcription tasks, in which a single speaker gives a long speech and speaker adaptation is therefore important. The proposed method selects speakers similar to the test speaker from the training database and uses their data to retrain the baseline DNN. Several speaker characteristic features are defined for the speaker similarity measure. The feature based on a Universal Background Model (UBM) and principal component analysis (PCA) achieves the best performance, yielding a significant improvement over both the baseline DNN and the adapted GMM-HMM system. Combining the method with a naive adaptation method that uses the initial ASR hypothesis of the test data achieves an additional improvement.

Keywords :

audio databases; neural nets; principal component analysis; speaker recognition; DNN-HMM; Deep Neural Network; PCA; UBM; lecture transcription; selecting similar speakers; test data; test speaker; training database; universal background model; unsupervised speaker adaptation; Accuracy; Adaptation models; Databases; Hidden Markov models; Speech; Speech recognition; Training

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Asia-Pacific Signal and Information Processing Association, 2014 Annual Summit and Conference (APSIPA)

Conference_Location :

Siem Reap

Type :

conf

DOI :

10.1109/APSIPA.2014.7041567

Filename :

7041567

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=117977