Title :
LP and TD-PSOLA-based incorporation of happiness in neutral speech using time-domain parameters
Author :
Sreenidhi, S. ; Rachel, G. Anushiya ; Vijayalakshmi, P. ; Nagarajan, T.
Author_Institution :
SSN Coll. of Eng., Chennai, India
Abstract :
Emotions express a person's internal state of being and are reflected in speech utterances. Emotions affect the time-domain characteristics of the speech signal, namely intonation patterns, speech rate, and the short-term energy function. Conventional text-to-speech (TTS) systems are built to produce speech utterances for a given text without any emotion, which can be called neutral speech. Building a TTS system that can produce speech utterances with an expected emotion is not a trivial task, since for each emotion a separate speech corpus must be carefully collected and a system built. Therefore, the current work focuses on incorporating happiness into neutral speech using signal processing algorithms. In this regard, neutral and happy speech are analyzed, and it is found that happiness can be perceived in certain emotive words in a sentence. Thus, in order to introduce happiness into neutral speech, these emotive keywords are identified and the above-mentioned time-domain parameters are modified. Linear prediction-based synthesis of happy speech is initially performed. To improve the quality of the synthesized speech, TD-PSOLA is then used. Subjective evaluation yields a mean opinion score of 2.05 (out of a maximum of 3) for happy speech synthesized using linear prediction and 2.53 for that synthesized using TD-PSOLA.
Keywords :
speech processing; speech synthesis; time-domain analysis; LP; TD-PSOLA; TTS system; emotive keywords; happy speech; intonation patterns; linear prediction-based synthesis; neutral speech; person internal state; short-term energy function; signal processing algorithms; speech corpus; speech rate; speech signal; speech utterances; text-to-speech systems; time-domain parameters; time-domain pitch synchronous overlap-add-based synthesis techniques; Computers; Polynomials; Spectrogram; Speech; Speech synthesis; Time-domain analysis; TD-PSOLA; happiness incorporation; linear prediction; neutral speech; pitch contour; short-term energy;
Conference_Title :
Circuit, Power and Computing Technologies (ICCPCT), 2014 International Conference on
Print_ISBN :
978-1-4799-2395-3
DOI :
10.1109/ICCPCT.2014.7054931