DocumentCode
1893373
Title
Browsing videos by automatically detected audio events
Author
Barbosa, Virgínia ; Pellegrini, T. ; Bugalho, M. ; Trancoso, Isabel
Author_Institution
IST, UTL, Lisbon, Portugal
fYear
2011
fDate
27-29 April 2011
Firstpage
1
Lastpage
4
Abstract
This paper focuses on Audio Event Detection (AED), a research area that aims to substantially improve access to audio in multimedia content. With the ever-growing quantity of multimedia documents uploaded to the Web, automatic description of the audio content of videos can provide very useful information for indexing, archiving, and searching multimedia documents. Preliminary experiments with a sound effects corpus showed good results for training models. However, performance is lower on the real-data test set, which contains overlapping audio events and continuous background noise. This paper describes the AED framework and the methodologies used to build six audio event detectors based on statistical machine learning tools (Support Vector Machines). The detectors showed promising improvements when background noise was added to the training data, which consists of clean sound effects that are quite different from the audio events in real-life videos and movies. A graphical interface prototype is also presented that allows browsing a movie by its content and provides an audio event description with time codes.
Keywords
audio signal processing; cinematography; multimedia communication; statistical analysis; support vector machines; video retrieval; video signal processing; AED framework; World Wide Web; audio access; audio event description; audio event detection; clean sound effect; continuous background noise; graphical interface prototype; movie browsing; multimedia content; multimedia document archive; multimedia document index; multimedia document search; overlapping audio event; real audio event; real life movies; real life video; sound effect corpus; statistical machine learning tool; support vector machine; time code; video audio content; video browsing; Detectors; Event detection; Feature extraction; Motion pictures; Noise measurement; Speech; Videos;
fLanguage
English
Publisher
IEEE
Conference_Titel
EUROCON - International Conference on Computer as a Tool (EUROCON), 2011 IEEE
Conference_Location
Lisbon
Print_ISBN
978-1-4244-7486-8
Type
conf
DOI
10.1109/EUROCON.2011.5929358
Filename
5929358