• DocumentCode
    1324522
  • Title
    Speech Emotion Analysis: Exploring the Role of Context
  • Author
    Tawari, Ashish ; Trivedi, Mohan Manubhai
  • Author_Institution
    Comput. Vision & Robot. Res. Lab., Univ. of California at San Diego, La Jolla, CA, USA
  • Volume
    12
  • Issue
    6
  • fYear
    2010
  • Firstpage
    502
  • Lastpage
    509
  • Abstract
    Automated analysis of human affective behavior has attracted increasing attention in recent years. With the research shift toward spontaneous behavior, many challenges have surfaced, ranging from database collection strategies to the use of new feature sets (e.g., lexical cues in addition to prosodic features). The use of contextual information, however, is rarely addressed in the field of affect expression recognition, yet it is evident that affect recognition by humans is largely influenced by contextual information. Our contribution in this paper is threefold. First, we introduce a novel set of features based on cepstrum analysis of pitch and intensity contours. We evaluate the usefulness of these features on two different databases: the Berlin Database of Emotional Speech (EMO-DB) and a locally collected audiovisual database in car settings (CVRRCar-AVDB). The overall recognition accuracy achieved is over 84% for seven emotions in EMO-DB and over 87% for three emotion classes in CVRRCar-AVDB, based on tenfold stratified cross validation. Second, we introduce the collection of a new audiovisual database in an automobile setting (CVRRCar-AVDB). In the current study, we use only the audio channel of the database. Third, we systematically analyze the effects of different contexts on the two databases. We present context analysis of subject and text based on speaker-/text-dependent and -independent analyses on EMO-DB. Furthermore, we perform context analysis based on gender information on EMO-DB and CVRRCar-AVDB. The results based on these analyses are promising.
  • Keywords
    cepstral analysis; emotion recognition; speech processing; Berlin database; EMO-DB database; affect expression recognition; audiovisual database; context analysis; human affective behavior; intensity contours; pitch cepstrum analysis; speaker independent analysis; speech emotion analysis; stratified cross validation; text dependent analysis; Accuracy; Context; Databases; Emotion recognition; Feature extraction; Speech; Speech recognition; Affect analysis; affective computing; context analysis; emotion intelligence; emotion recognition; emotional speech; vocal expression;
  • fLanguage
    English
  • Journal_Title
    Multimedia, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1520-9210
  • Type
    jour
  • DOI
    10.1109/TMM.2010.2058095
  • Filename
    5571815