DocumentCode
2014889
Title
Audio-visual vibraphone transcription in real time
Author
Tavares, Tiago F. ; Odowichuck, Gabrielle ; Zehtabi, Sonmaz ; Tzanetakis, George
Author_Institution
Dept. of Comput. Sci., Univ. of Victoria, Victoria, BC, Canada
fYear
2012
fDate
17-19 Sept. 2012
Firstpage
215
Lastpage
220
Abstract
Music transcription refers to the process of detecting musical events (typically consisting of notes, starting times and durations) from an audio signal. Most existing work in automatic music transcription has focused on offline processing. In this work, we describe our efforts in building a system for real-time music transcription for the vibraphone. We describe experiments with three audio-based methods for music transcription that are representative of the state of the art. One method is based on multiple pitch estimation, and the other two are based on factorization of the audio spectrogram. In addition, we show how information from a video camera can be used to impose constraints on the symbol search space based on the gestures of the performer. Experimental results with various system configurations show that this multi-modal approach leads to a significant reduction of false positives and increases the overall accuracy. This improvement is observed for all three audio methods and indicates that visual information is complementary to the audio information in this context.
Keywords
audio signal processing; audio-visual systems; estimation theory; music; real-time systems; search problems; video cameras; video signal processing; audio information; audio methods; audio signal; audio spectrogram factorization; audio-based methods; audio-visual vibraphone transcription; automatic music transcription; false positives; multimodal approach; multiple pitch estimation; musical event detection; offline processing; real time music transcription; symbol search space; system configurations; video camera; visual information; Acoustics; Algorithm design and analysis; Cameras; Computer vision; Harmonic analysis; Instruments; Noise; Audiovisual; Music; Transcription;
fLanguage
English
Publisher
IEEE
Conference_Titel
Multimedia Signal Processing (MMSP), 2012 IEEE 14th International Workshop on
Conference_Location
Banff, AB
Print_ISBN
978-1-4673-4570-5
Electronic_ISBN
978-1-4673-4571-2
Type
conf
DOI
10.1109/MMSP.2012.6343443
Filename
6343443