DocumentCode :
2972898
Title :
Speaker de-identification via voice transformation
Author :
Jin, Qin ; Toth, Arthur R. ; Schultz, Tanja ; Black, Alan W.
Author_Institution :
Language Technol. Inst., Carnegie Mellon Univ., Pittsburgh, PA, USA
fYear :
2009
fDate :
Dec. 13 2009-Dec. 17 2009
Firstpage :
529
Lastpage :
533
Abstract :
It is a common feature of modern automated voice-driven applications and services to record and transmit a user's spoken request. At the same time, several domains and applications may require keeping the content of the user's request confidential while also protecting the speaker's identity. This requires a technology that allows the speaker's voice to be de-identified, in the sense that the voice sounds natural and intelligible but does not reveal the identity of the speaker. In this paper we investigate different voice transformation strategies on a large population of speakers to disguise the speakers' identities while preserving the intelligibility of the voices. We apply two automatic speaker identification approaches, a GMM-based and a phonetic approach, to verify the success of de-identification with voice transformation. The evaluation based on the automatic speaker identification systems verifies that the proposed voice transformation technique enables transmission of the content of the users' spoken requests while successfully concealing their identities. Also, the results indicate that different speakers still sound distinct after the transformation. Furthermore, we carried out a human listening test that proved the transformed speech to be both intelligible and securely de-identified, as it hid the identity of the speakers even from listeners who knew the speakers very well.
Keywords :
Gaussian processes; speaker recognition; speech intelligibility; Gaussian mixture model; automated voice-driven applications; automatic speaker identification approach; human listening test; phonetic approach; speaker de-identification; voice intelligibility; voice transformation technique; Cepstral analysis; Humans; Information security; Loudspeakers; Natural languages; Privacy; Speech coding; Speech recognition; System testing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on
Conference_Location :
Merano
Print_ISBN :
978-1-4244-5478-5
Electronic_ISBN :
978-1-4244-5479-2
Type :
conf
DOI :
10.1109/ASRU.2009.5373356
Filename :
5373356