• DocumentCode
    1893373
  • Title
    Browsing videos by automatically detected audio events
  • Author
    Barbosa, Virgínia ; Pellegrini, T. ; Bugalho, M. ; Trancoso, Isabel
  • Author_Institution
    IST, UTL, Lisbon, Portugal
  • fYear
    2011
  • fDate
    27-29 April 2011
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    This paper focuses on Audio Event Detection (AED), a research area that aims to substantially enhance access to audio in multimedia content. With the ever-growing quantity of multimedia documents uploaded to the Web, automatic description of the audio content of videos can provide useful information for indexing, archiving and searching multimedia documents. Preliminary experiments with a sound effects corpus showed good results for training models. However, performance is lower on the real data test set, which contains overlapping audio events and continuous background noise. This paper describes the AED framework and the methodologies used to build six audio event detectors based on statistical machine learning tools (Support Vector Machines). The detectors showed promising improvements when background noise was added to the training data, which consists of clean sound effects that differ considerably from the audio events found in real-life videos and movies. A graphical interface prototype is also presented, which allows browsing a movie by its content and provides an audio event description with time codes.
  • Keywords
    audio signal processing; cinematography; multimedia communication; statistical analysis; support vector machines; video retrieval; video signal processing; AED framework; World Wide Web; audio access; audio event description; audio event detection; clean sound effect; continuous background noise; graphical interface prototype; movie browsing; multimedia content; multimedia document archive; multimedia document index; multimedia document search; overlapping audio event; real audio event; real life movies; real life video; sound effect corpus; statistical machine learning tool; support vector machine; time code; video audio content; video browsing; Detectors; Event detection; Feature extraction; Motion pictures; Noise measurement; Speech; Videos;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    EUROCON - International Conference on Computer as a Tool (EUROCON), 2011 IEEE
  • Conference_Location
    Lisbon
  • Print_ISBN
    978-1-4244-7486-8
  • Type
    conf
  • DOI
    10.1109/EUROCON.2011.5929358
  • Filename
    5929358