DocumentCode :
1958400
Title :
Spectral mismatch as the index of quality of naturalness in synthetic speech
Author :
Kawachale, S.P. ; Gengaje, S.R. ; Chitode, J.S.
Author_Institution :
Dept. of E&TC, M.I.T., Pune, India
fYear :
2009
fDate :
23-26 Aug. 2009
Firstpage :
808
Lastpage :
813
Abstract :
It is extremely difficult to build a machine that sounds identical to a human. Even the best text-to-speech (TTS) algorithms sound robotic unless recorded human speech is used in the synthesis. However, it is not possible to create a database of every possible word in a language. Syllable-based concatenative speech synthesis (CSS) forms new words from existing words in the database. Improper concatenation with respect to the position of the syllable leads to spectral mismatch. A first step toward overcoming this is to estimate spectral mismatch with respect to syllable position. We propose a method based on power spectral density (PSD) to estimate position-dependent spectral mismatch. The PSD is plotted for 10-millisecond samples of original, properly concatenated (PC), and improperly concatenated (IC) words. These samples are then made noise free so that their low-amplitude peaks can be neglected. Analysis of the PSD locates the formants in the given samples. Formants for the original, properly concatenated, and improperly concatenated words are then plotted. The formant plots for the original and properly concatenated words are very similar for all formants, whereas improper concatenation produces extra peaks in all formants. These extra peaks can be taken as an estimate of spectral mismatch. The results are validated using Marathi text-to-speech synthesis.
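The pipeline described in the abstract (PSD of a 10 ms frame, suppression of low-amplitude peaks, peak location, comparison against a reference) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give their exact procedure, so the Hann-windowed periodogram, the 16 kHz sample rate, the relative threshold `rel_floor`, the matching tolerance `tol_hz`, and all function names are assumptions made for illustration. The input frames are synthetic sums of sinusoids standing in for speech, not real recordings.

```python
import cmath
import math


def periodogram(frame, sample_rate):
    """Return (freqs_hz, psd) for one frame via a Hann-windowed DFT."""
    n = len(frame)
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]
    scale = sample_rate * sum(w * w for w in window)
    freqs, psd = [], []
    for k in range(n // 2 + 1):
        acc = sum(
            frame[i] * window[i] * cmath.exp(-2j * math.pi * k * i / n)
            for i in range(n)
        )
        freqs.append(k * sample_rate / n)
        psd.append(abs(acc) ** 2 / scale)
    return freqs, psd


def spectral_peaks(freqs, psd, rel_floor=0.05):
    """Local maxima of the PSD; peaks below rel_floor * max(psd) are ignored.

    The threshold is an assumed stand-in for the 'noise free' step.
    """
    floor = rel_floor * max(psd)
    return [
        freqs[k]
        for k in range(1, len(psd) - 1)
        if psd[k] > floor and psd[k] > psd[k - 1] and psd[k] >= psd[k + 1]
    ]


def extra_peaks(reference, candidate, tol_hz=150.0):
    """Candidate peaks that have no reference peak within tol_hz."""
    return [f for f in candidate if all(abs(f - r) > tol_hz for r in reference)]


def synth_frame(components, sample_rate, duration_s=0.010):
    """Synthetic test frame: a sum of (frequency_hz, amplitude) sinusoids."""
    n = int(round(sample_rate * duration_s))
    return [
        sum(a * math.sin(2 * math.pi * f * i / sample_rate) for f, a in components)
        for i in range(n)
    ]


SAMPLE_RATE = 16000  # assumed; 10 ms then gives 160 samples per frame

# Hypothetical resonances standing in for an original word's formants.
reference_frame = synth_frame([(700, 1.0), (1200, 0.8), (2600, 0.5)], SAMPLE_RATE)
# Same resonances plus one spurious component, mimicking a bad join.
candidate_frame = synth_frame(
    [(700, 1.0), (1200, 0.8), (2600, 0.5), (3400, 0.6)], SAMPLE_RATE
)

ref_peaks = spectral_peaks(*periodogram(reference_frame, SAMPLE_RATE))
cand_peaks = spectral_peaks(*periodogram(candidate_frame, SAMPLE_RATE))
mismatch = extra_peaks(ref_peaks, cand_peaks)
print("reference peaks:", ref_peaks)
print("extra peaks in candidate:", mismatch)
```

A raw periodogram peak is only a rough proxy for a formant; practical formant tracking more commonly relies on a smoothed spectral envelope, for example from linear prediction. The sketch only shows how extra peaks relative to a reference frame could be counted as a mismatch indicator.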
Keywords :
speech synthesis; concatenative speech synthesis; power spectral density; spectral mismatch; synthetic speech; text-to-speech algorithm; Acoustic noise; Cascading style sheets; Concatenated codes; Databases; Frequency; Humans; Magnetic heads; Robots; Speech analysis; Speech synthesis;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Communications, Computers and Signal Processing, 2009. PacRim 2009. IEEE Pacific Rim Conference on
Conference_Location :
Victoria, BC
Print_ISBN :
978-1-4244-4560-8
Electronic_ISBN :
978-1-4244-4561-5
Type :
conf
DOI :
10.1109/PACRIM.2009.5291267
Filename :
5291267