DocumentCode
1524957
Title
Evaluation of Speaker Verification Security and Detection of HMM-Based Synthetic Speech
Author
De Leon, Phillip L. ; Pucher, Michael ; Yamagishi, Junichi ; Hernaez, Inma ; Saratxaga, Ibon
Author_Institution
Klipsch School of Electrical and Computer Engineering, New Mexico State University (NMSU), Las Cruces, NM, USA
Volume
20
Issue
8
fYear
2012
Firstpage
2280
Lastpage
2290
Abstract
In this paper, we evaluate the vulnerability of speaker verification (SV) systems to synthetic speech. The SV systems are based on either the Gaussian mixture model–universal background model (GMM-UBM) or support vector machine (SVM) using GMM supervectors. We use a hidden Markov model (HMM)-based text-to-speech (TTS) synthesizer, which can synthesize speech for a target speaker using small amounts of training data through model adaptation of an average voice or background model. Although the SV systems have a very low equal error rate (EER), when tested with synthetic speech generated from speaker models derived from the Wall Street Journal (WSJ) speech corpus, over 81% of the matched claims are accepted. This result suggests vulnerability in SV systems and thus a need to accurately detect synthetic speech. We propose a new feature based on relative phase shift (RPS), demonstrate reliable detection of synthetic speech, and show how this classifier can be used to improve security of SV systems.
Keywords
Adaptation models; Hidden Markov models; Speech; Support vector machines; Synthesizers; Training; Vectors; Security; speaker recognition; speech synthesis
fLanguage
English
Journal_Title
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1558-7916
Type
jour
DOI
10.1109/TASL.2012.2201472
Filename
6205335