DocumentCode :
2419288
Title :
User-driven recognition of audio events in news videos
Author :
Giannakopoulos, Theodoros ; Petridis, Sergios ; Perantonis, Stavros
Author_Institution :
Inst. of Inf. & Telecommun., Comput. Intell. Lab., Nat. Center for Sci. Res. Demokritos, Greece
fYear :
2010
fDate :
9-10 Dec. 2010
Firstpage :
44
Lastpage :
49
Abstract :
We propose a method for user-driven recognition of events in audio streams, aiming to help journalists easily annotate unedited audiovisual content. Nonlocal information provided by the user, for example that the sound of applause exists within the video, is used to adapt the audio event classifiers so as to detect the exact position of these events in the video. To this end, each audio class is modeled using a Support Vector Machine (SVM), and the final automatic decision is taken on a mid-term audio basis, using a variant of the One-vs-All architecture. A weighting function is generated from the user input and applied to the soft-output decision of the respective SVMs, thus adapting the final decision to the knowledge provided by the user. To evaluate our method, we used a large dataset of real news videos, provided by the German international broadcaster (DW - Deutsche Welle) and the Portuguese broadcaster (Lusa - Agência de Notícias de Portugal), in which five audio classes that occur frequently in the dataset are defined. Results show that this process leads to a significant increase in audio tracking performance.
Keywords :
audio signal processing; audio streaming; decision making; image classification; support vector machines; video signal processing; German international broadcaster; Portuguese broadcaster; audio class; audio event classifier; audio tracking; automatic decision; journalists assist; midterm audio basis; news video; nonlocal information; one vs all architecture; soft output decision; support vector machine; unedited audiovisual content; user driven recognition; weighting function; Entropy; Event detection; Feature extraction; Mathematical model; Speech; Support vector machines; Videos;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Semantic Media Adaptation and Personalization (SMAP), 2010 5th International Workshop on
Conference_Location :
Limassol
Print_ISBN :
978-1-4244-8603-8
Electronic_ISBN :
978-1-4244-8601-4
Type :
conf
DOI :
10.1109/SMAP.2010.5706867
Filename :
5706867