DocumentCode :
3863821
Title :
Improved deep speaker feature learning for text-dependent speaker recognition
Author :
Lantian Li;Yiye Lin;Zhiyong Zhang;Dong Wang
Year :
2015
Firstpage :
426
Lastpage :
429
Abstract :
A deep learning approach has been proposed recently to derive speaker identities (d-vectors) using a deep neural network (DNN). This approach has been applied to text-dependent speaker recognition tasks and shows reasonable performance gains when combined with the conventional i-vector approach. Although promising, the existing d-vector implementation still cannot compete with the i-vector baseline. This paper presents two improvements for the deep learning approach: a phone-dependent DNN structure to normalize phone variation, and a new scoring approach based on dynamic time warping (DTW). Experiments on a text-dependent speaker recognition task demonstrated that the proposed methods can provide considerable performance improvement over the existing d-vector implementation.
Keywords :
"Speaker recognition","Training","Mel frequency cepstral coefficient","Data models","Machine learning","Training data"
Publisher :
IEEE
Conference_Title :
Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2015 Asia-Pacific
Type :
conf
DOI :
10.1109/APSIPA.2015.7415306
Filename :
7415306