Synthesizing emotions in speech: is it time to get excited?

Author

Murray, Iain R. ; Arnott, John L.

Author_Institution

The MicroCentre, Dundee Univ., UK

Volume

3

fYear

1996

fDate

3-6 Oct 1996

Firstpage

1816

Abstract

Modern speech synthesis systems with very high intelligibility are readily available in a number of languages. However, the output from all present systems is still readily identifiable as being machine generated; the output does not sound “natural”. One aspect of naturalness is the variability introduced by the emotional state of the speaker, and related pragmatic effects; no current commercial systems include such variation. Comparatively little work has been done to investigate how a speaker's emotional state creates variation in the speech signal, and this work has traditionally been performed by psychologists and has remained distinct from mainstream speech science. Current research suggests that there will be considerable effort involved in producing any accurate description of pragmatic variations in speech, but there has recently been increasing interest in this area due to potential applications in many branches of speech technology. The paper describes a prototype system which has been constructed to simulate emotion in speech synthesized by rule. The system is based on emotion information from the literature, and it simulates a range of emotions using a commercial synthesiser. The use of emotion models and their applicability in the area of speech technology is discussed. The limitations of our current knowledge in the area of vocal emotion are discussed, and suggestions are presented for future research in this area.

Keywords

human factors; psychology; speech intelligibility; speech synthesis; commercial synthesiser; emotion information; emotion models; emotion simulation; emotion synthesis; emotional state; high intelligibility; machine generated; naturalness; prototype system; psychologists; speech signal; speech synthesis systems; speech technology; vocal emotion; Humans; Loudspeakers; Modems; Mood; Natural languages; Production systems; Psychology; Speech recognition; Speech synthesis; Virtual prototyping

fLanguage

English

Publisher

IEEE

Conference_Titel

Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on

Conference_Location

Philadelphia, PA

Print_ISBN

0-7803-3555-4

Type

conf

DOI

10.1109/ICSLP.1996.607983

Filename

607983