Title :
Exploiting Psychological Factors for Interaction Style Recognition in Spoken Conversation
Author :
Wen-Li Wei ; Chung-Hsien Wu ; Jen-Chun Lin ; Han Li
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Nat. Cheng Kung Univ., Tainan, Taiwan
Abstract :
Determining how a speaker is engaged in a conversation is crucial for achieving harmonious interaction between computers and humans. In this study, a fusion approach was developed based on psychological factors to recognize Interaction Style (IS) in spoken conversation, which plays a key role in creating natural dialogue agents. The proposed Fused Cross-Correlation Model (FCCM) provides a unified probabilistic framework to model the relationships among the psychological factors of emotion, personality trait (PT), transient IS, and IS history, for recognizing IS. An emotional arousal-dependent speech recognizer was used to obtain the recognized spoken text for extracting linguistic features to estimate transient IS likelihood and recognize PT. A temporal course modeling approach and an emotional sub-state language model, based on the temporal phases of an emotional expression, were employed to obtain a better emotion recognition result. The experimental results indicate that the proposed FCCM yields satisfactory results in IS recognition and also demonstrate that combining psychological factors effectively improves IS recognition accuracy.
Keywords :
correlation methods; emotion recognition; feature extraction; probability; speaker recognition; FCCM; emotional arousal-dependent speech recognizer; emotional sub-state language model; fused cross-correlation model; fusion approach; harmonious interaction; interaction style recognition; linguistic feature extraction; natural dialogue agents; psychological factors; spoken conversation; temporal course modeling approach; temporal phases; transient IS likelihood; unified probabilistic framework; Accuracy; Psychology; Speech; Speech recognition; Text recognition; Emotion; interaction style; language model; personality trait; temporal phase
Journal_Title :
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
DOI :
10.1109/TASLP.2014.2300339