Title :
Speaker verification with deep features
Author :
Yuan Liu ; Tianfan Fu ; Yuchen Fan ; Yanmin Qian ; Kai Yu
Author_Institution :
Dept. of Comput. Sci. & Eng., Shanghai Jiao Tong Univ., Shanghai, China
Abstract :
Due to the great success of deep learning in speech recognition, there has been interest in applying deep learning to speaker verification. Previous investigations have usually focused on using deep neural networks as new classifiers or to extract speaker-dependent features. These approaches are either incompatible with existing speaker verification approaches or unable to achieve significant performance gains in large-scale tasks. Moreover, none of the previous approaches has addressed how to make use of extra unsupervised data. This paper proposes a novel feature engineering approach within the deep learning framework for speaker verification. Hidden-layer outputs of a deep neural network or deep belief network trained on a large amount of speech recognition data are extracted as deep features. These features are then used in a Tandem fashion or concatenated with the original acoustic features for GMM-UBM speaker verification. The proposed approach can make use of large amounts of existing speech recognition data without speaker labels and is easy to combine with other mature classification approaches. Experiments on the core condition of NIST 2006 SRE showed that, in a text-independent task, the proposed approach can achieve a 12.8% relative EER improvement over the standard GMM-UBM system. In addition, text-dependent speaker verification experiments were performed and yielded similarly significant gains.
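The feature-concatenation idea described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `deep_features` and `concat_features`, the 39-dimensional acoustic frames, the 64-unit hidden layers, the sigmoid activation, and the choice of the second hidden layer. The network weights are random placeholders, whereas the abstract describes a network trained on speech recognition data.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic activation."""
    return 1.0 / (1.0 + np.exp(-x))


def deep_features(frames, weights, biases, layer):
    """Return the output of hidden layer `layer` (1-based) for every frame."""
    h = frames
    for W, b in zip(weights[:layer], biases[:layer]):
        h = sigmoid(h @ W + b)
    return h


def concat_features(acoustic, deep):
    """Append the deep features to the original acoustic features, frame by frame."""
    return np.hstack([acoustic, deep])


# Hypothetical sizes: 200 frames, 39-dim acoustic features, three 64-unit hidden layers.
rng = np.random.default_rng(0)
n_frames, acoustic_dim, hidden_dim = 200, 39, 64
frames = rng.standard_normal((n_frames, acoustic_dim))

# Placeholder weights; a real system would load a trained network instead.
sizes = [acoustic_dim, hidden_dim, hidden_dim, hidden_dim]
weights = [0.1 * rng.standard_normal((i, o)) for i, o in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(o) for o in sizes[1:]]

deep = deep_features(frames, weights, biases, layer=2)   # shape (200, 64)
combined = concat_features(frames, deep)                 # shape (200, 103)
```

In a pipeline of the kind the abstract outlines, an array like `combined` would take the place of the plain acoustic features when training and scoring the GMM-UBM back end; that stage is omitted from this sketch.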
Keywords :
belief networks; feature extraction; learning (artificial intelligence); neural nets; speaker recognition; EER improvement; GMM-UBM speaker verification; NIST 2006 SRE; deep belief network; deep feature extraction; deep learning framework; deep neural networks hidden layer output; feature engineering approach; speech recognition data; text-independent speaker verification; text-dependent speaker verification; Acoustics; Feature extraction; Hidden Markov models; Neural networks; Speaker recognition; Speech; Speech recognition;
Conference_Title :
Neural Networks (IJCNN), 2014 International Joint Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4799-6627-1
DOI :
10.1109/IJCNN.2014.6889708