• DocumentCode
    3713026
  • Title

    Keynote speech 4: Extraction of linguistic and paralinguistic information from audio-visual data

  • Author

    Shrikanth Narayanan

  • Author_Institution
    Univ. of Southern California, Los Angeles, CA, USA
  • fYear
    2015
  • Firstpage
    1
  • Lastpage
    2
  • Abstract
    Audio-visual data have been a key enabler of human observational research and practice. The confluence of sensing, communication and computing technologies is allowing capture of and access to data, in diverse forms and modalities, in ways that were unimaginable even a few years ago. Importantly, these data afford the analysis and interpretation of multimodal cues of verbal and non-verbal human behavior. These signals carry crucial information about not only a person's intent and identity but also underlying attitudes and emotions. Automatically capturing these cues, although vastly challenging, offers the promise not just of efficient data processing but also of tools for discovery that enable hitherto unimagined insights. Recent computational approaches that make judicious use of both data and knowledge have yielded significant advances in this regard, for example in deriving rich information from multimodal sources including human speech, language, and videos of visual behavior. This talk will focus on some of the advances and challenges in gathering such data and creating algorithms for machine processing of such cues. It will also introduce some of the freely available data resources for research.
  • Publisher
    IEEE
  • Conference_Titel
    Oriental COCOSDA held jointly with 2015 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2015 International Conference
  • Type

    conf

  • DOI
    10.1109/ICSDA.2015.7357854
  • Filename
    7357854