DocumentCode
3713026
Title
Keynote speech 4: Extraction of linguistic and paralinguistic information from audio-visual data
Author
Shrikanth Narayanan
Author_Institution
Univ. of Southern California, Los Angeles, CA, USA
fYear
2015
Firstpage
1
Lastpage
2
Abstract
Audio-visual data have been a key enabler of human observational research and practice. The confluence of sensing, communication and computing technologies is allowing the capture of and access to data, in diverse forms and modalities, in ways that were unimaginable even a few years ago. Importantly, these data afford the analysis and interpretation of multimodal cues of verbal and non-verbal human behavior. These signals carry crucial information about not only a person's intent and identity but also underlying attitudes and emotions. Automatically capturing these cues, although vastly challenging, offers the promise of not just efficient data processing but also tools for discovery that enable hitherto unimagined insights. Recent computational approaches that make judicious use of both data and knowledge have yielded significant advances in this regard, for example in deriving rich information from multimodal sources including human speech, language, and videos of visual behavior. This talk will focus on some of the advances and challenges in gathering such data and creating algorithms for machine processing of such cues. It will also introduce some of the freely available data resources for research.
Publisher
IEEE
Conference_Titel
Oriental COCOSDA held jointly with 2015 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2015 International Conference
Type
conf
DOI
10.1109/ICSDA.2015.7357854
Filename
7357854