• DocumentCode
    1668604
  • Title
    Multimodal analysis of speech prosody and upper body gestures using hidden semi-Markov models
  • Author
    Bozkurt, E.; Asta, Shahriar; Ozkul, Serkan; Yemez, Y.; Erzin, E.
  • Author_Institution
    Multimedia, Vision & Graphics Lab., Koc Univ., Istanbul, Turkey
  • fYear
    2013
  • Firstpage
    3652
  • Lastpage
    3656
  • Abstract
    Gesticulation is an essential component of face-to-face communication, and it contributes significantly to the natural and affective perception of human-to-human communication. In this work, we investigate a new multimodal analysis framework that models relationships between intonational and gesture phrases using hidden semi-Markov models (HSMMs). The HSMM framework effectively associates longer-duration gesture phrases with shorter-duration prosody clusters while maintaining realistic gesture phrase duration statistics. We evaluate the multimodal analysis framework by generating speech-prosody-driven gesture animation and assessing it with both subjective and objective metrics.
  • Keywords
    gesture recognition; hidden Markov models; modal analysis; speech recognition; HSMM; affective perception; face-to-face communication; gesticulation; gesture animation; hidden semi-Markov models; human-to-human communication; intonational phrase; multimodal analysis; natural perception; prosody clusters; realistic gesture phrase duration statistics; speech prosody; upper body gestures; Analytical models; Animation; Feature extraction; Joints; Mathematical model; Speech; Prosody analysis; gesture segmentation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
  • Conference_Location
    Vancouver, BC
  • ISSN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2013.6638339
  • Filename
    6638339