DocumentCode :
45363
Title :
Toward a Universal Synthetic Speech Spoofing Detection Using Phase Information
Author :
Sanchez, Jon ; Saratxaga, Ibon ; Hernaez, Inma ; Navas, Eva ; Erro, Daniel ; Raitio, Tuomo
Author_Institution :
Aholab Signal Process. Lab., Univ. of the Basque Country, Bilbao, Spain
Volume :
10
Issue :
4
fYear :
2015
fDate :
April 2015
Firstpage :
810
Lastpage :
820
Abstract :
In the field of speaker verification (SV), it is nowadays feasible and relatively easy to create a synthetic voice to deceive a speech-driven biometric access system. This paper presents a synthetic speech detector that can be connected at the front end or at the back end of a standard SV system, and that will protect it from spoofing attacks coming from state-of-the-art statistical text-to-speech (TTS) systems. The system described is a Gaussian mixture model (GMM) based binary classifier that uses natural and copy-synthesized signals obtained from the Wall Street Journal database to train the system models. Three different state-of-the-art vocoders are chosen and modeled using two sets of acoustic parameters: 1) relative phase shift and 2) canonical Mel frequency cepstral coefficient (MFCC) parameters, as a baseline. The vocoder dependency of the system and multivocoder modeling features are thoroughly studied. Additional phase-aware vocoders are also tested. Several experiments are carried out, showing that the phase-based parameters perform better and are able to cope with new unknown attacks. The final evaluations, testing synthetic TTS signals obtained from the Blizzard Challenge, validate our proposal.
Keywords :
Gaussian processes; biometrics (access control); mixture models; speech synthesis; vocoders; GMM; Gaussian mixture model based binary classifier; MFCC; Wall Street Journal database; acoustic parameters; Blizzard Challenge; canonical mel frequency cepstral coefficients parameters; copy-synthesized signals; phase information; phase-aware vocoders; relative phase shift; speaker verification; speech driven biometric access system; standard SV system; statistical text to speech systems; synthetic TTS signals; synthetic speech detector; synthetic voice; universal synthetic speech spoofing detection; Databases; Harmonic analysis; Mel frequency cepstral coefficient; Speech; Speech synthesis; Training; Vocoders; BIO-MODA-VOI; Voice biometrics; anti-spoofing; phase information; synthetic speech detection; voice biometrics;
fLanguage :
English
Journal_Title :
Information Forensics and Security, IEEE Transactions on
Publisher :
IEEE
ISSN :
1556-6013
Type :
jour
DOI :
10.1109/TIFS.2015.2398812
Filename :
7029029
Link To Document :