DocumentCode :
3406053
Title :
Audio-based affect detection in web videos
Author :
Chisholm, Dave ; Siddiquie, Behjat ; Divakaran, Ajay ; Shriberg, Elizabeth
Author_Institution :
SRI Int., Princeton, NJ, USA
fYear :
2015
fDate :
June 29 2015-July 3 2015
Firstpage :
1
Lastpage :
6
Abstract :
We present a new technique for detecting audio concepts in web content and outline its applications to video sequence parsing. Our focus is primarily on affective concepts; to study them, we have collected a new dataset, called “Rallying a Crowd”, consisting of videos in which a speaker is persuading a crowd. We develop new classifiers for graded levels of arousal in speech, as well as for crowd noise and music, and demonstrate their effectiveness on web content. These techniques achieve high detection accuracy (58.2%) for affective concepts on this new dataset and outperform state-of-the-art techniques for semantic concepts on a previously collected dataset (36.8% versus 33.1%). We also develop a new audio sequence segmentation technique that enables us to rapidly classify subsections of test sequence audio into the aforementioned audio classes. We are thus able to robustly address both the detection of affective concepts in highly variable web content and the computational challenge of quick classification, enabling web-scale processing.
Keywords :
Internet; audio signal processing; video signal processing; Web content; Web scale processing; Web videos; audio concept detection; audio sequence segmentation technique; audio-based affect detection; crowd noise; video sequence parsing; Feature extraction; Kernel; Mel frequency cepstral coefficient; Semantics; Speech; Support vector machines; Videos; Affect detection; Audio concept detection; Audio segmentation
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Multimedia and Expo (ICME), 2015 IEEE International Conference on
Conference_Location :
Turin
Type :
conf
DOI :
10.1109/ICME.2015.7177525
Filename :
7177525