DocumentCode :
2396125
Title :
Detection on PSOLA-modified voices by seeking out duplicated fragments
Author :
Shen, Yifeng ; Jia, Jia ; Cai, Lianhong
Author_Institution :
Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
fYear :
2012
fDate :
19-20 May 2012
Firstpage :
2177
Lastpage :
2182
Abstract :
Pitch Synchronous Overlap-Add (PSOLA) refers to a family of signal processing techniques widely used for prosodic modification. These techniques can modify a person's voice by altering the prosodic characteristics of speech, making the voice unrecognizable or unidentifiable. Well-modified voices may even defeat speaker recognition, a critical step in digital audio forensics. Time-domain PSOLA (TD-PSOLA) is the most popular algorithm in the PSOLA family. It supports both time-scaling and pitch-scaling modifications, and the synthesis quality is extremely high provided that the modifications do not exceed a factor of two. This paper presents a simple method to determine whether a given speech waveform has been modified by the TD-PSOLA algorithm. By searching the time-domain waveform for duplicated fragments, we extract the number of duplicated fragments and their occurrence frequency in the voiced portions of speech. A single feature, the duplicated fragments density (DFD), is then calculated and compared with a threshold (obtained from a large body of earlier statistical results) to decide whether the questioned speech waveform has been modified. Experimental results demonstrate the effectiveness of the method in detecting voices whose pitch has been raised and/or whose duration has been lengthened using the TD-PSOLA algorithm.
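The detection idea in the abstract (count duplicated fragments, turn the count into a density, compare it with a threshold) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the fragment length `frag_len`, the search range `max_lag`, the tolerance `tol`, and the normalisation of the density by the number of voiced samples are all assumptions made for illustration, and the abstract does not give the paper's actual DFD definition or threshold.

```python
from typing import Optional, Sequence


def count_duplicated_fragments(signal: Sequence[float], frag_len: int = 8,
                               max_lag: int = 400, tol: float = 0.0) -> int:
    """Count fragments of `frag_len` samples that reappear later in the signal.

    Hypothetical helper: a fragment starting at i counts as duplicated if the
    same samples (within `tol`) occur again at i + lag for some
    frag_len <= lag <= max_lag.
    """
    n = len(signal)
    count = 0
    i = 0
    while i + frag_len <= n:
        found = False
        for lag in range(frag_len, max_lag + 1):
            j = i + lag
            if j + frag_len > n:
                break  # the later fragment would run past the end
            if all(abs(signal[i + k] - signal[j + k]) <= tol
                   for k in range(frag_len)):
                found = True
                break
        if found:
            count += 1
            i += frag_len  # skip the matched fragment to avoid double counting
        else:
            i += 1
    return count


def duplicated_fragment_density(signal: Sequence[float],
                                voiced_len: Optional[int] = None,
                                **kwargs) -> float:
    """Assumed density: duplicated fragments per voiced sample."""
    voiced = len(signal) if voiced_len is None else voiced_len
    if voiced == 0:
        return 0.0
    return count_duplicated_fragments(signal, **kwargs) / voiced


def is_probably_modified(signal: Sequence[float], threshold: float,
                         **kwargs) -> bool:
    """Flag the signal when its density exceeds an externally chosen threshold."""
    return duplicated_fragment_density(signal, **kwargs) > threshold


# Toy data: a ramp has no repeated fragments; a block repeated three times does.
ramp = [float(v) for v in range(100)]
block = [0.1, 0.5, -0.3, 0.9, -0.7, 0.2, 0.4, -0.6]
repeated = block * 3
print(count_duplicated_fragments(ramp, frag_len=4, max_lag=16))
print(count_duplicated_fragments(repeated, frag_len=4, max_lag=16))
```

In practice the threshold would have to come from statistics gathered on known unmodified and modified recordings, as the abstract indicates, and the signal passed in would be restricted to the voiced portions of the speech.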
Keywords :
speaker recognition; PSOLA modified voice detection; duplicated fragments; occurrence frequency; pitch synchronous overlap add; seeking out duplicated fragments; signal processing techniques; speaker recognition process; speech voiced portions; speech waveform; voice unidentifiable; voice unrecognizable; Feature extraction; Forensics; Signal processing algorithms; Speech; Speech processing; Timbre; Time domain analysis; Digital Audio Forensic; Duplicated Fragments; PSOLA; Speech Processing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Systems and Informatics (ICSAI), 2012 International Conference on
Conference_Location :
Yantai
Print_ISBN :
978-1-4673-0198-5
Type :
conf
DOI :
10.1109/ICSAI.2012.6223483
Filename :
6223483