DocumentCode :
134326
Title :
Reconstruction of pitch for whisper-to-speech conversion of Chinese
Author :
Jingjie Li ; McLoughlin, Ian Vince ; Yan Song
Author_Institution :
Nat. Eng. Lab. of Speech & Language Inf. Process., Univ. of Sci. & Technol. of China, Hefei, China
fYear :
2014
fDate :
12-14 Sept. 2014
Firstpage :
206
Lastpage :
210
Abstract :
Whispers are a common and necessary secondary vocal communications mechanism in natural human-to-human dialogue. They are also the primary communications mechanism for many people with aphonia, such as laryngectomees. For typical speakers, whispering is predominantly contextual, prompted either by the sensitive nature of the information being conveyed or by environmental considerations. Given the importance of whispers, especially for tonal languages like Chinese, and the fact that many communications systems assume vocalised speech, much work has been directed towards converting whispers into natural-sounding speech. Since pitch information is largely absent in whispers, it is this key f0 information that must be supplied during the regeneration process, and it is the focus of much research. GMM-based reconstruction techniques have proven effective at whisper reconstruction, and some recent work has proposed artificial pitch derived from formant harmonics as an alternative. This paper describes a new formulation of the formant-harmonic f0 method and compares it directly against a novel GMM-based f0 estimator, as well as against known correct pitch excitation for parallel utterances.
Keywords :
natural language processing; speech processing; Chinese; GMM-based reconstruction techniques; aphonia; artificial pitch; formant-harmonic f0 method; laryngectomees; natural human-to-human dialogue; natural sounding speech; parallel utterances; pitch reconstruction; primary communications mechanism; regeneration process; secondary vocal communications mechanism; tonal languages; vocalised speech; whisper conversion; whisper reconstruction; whisper-to-speech conversion; Cepstral analysis; Feature extraction; Joints; Modulation; Speech; Speech processing; Vectors; GMM; Whisper speech; speech reconstruction
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on
Conference_Location :
Singapore
Type :
conf
DOI :
10.1109/ISCSLP.2014.6936709
Filename :
6936709