Title :
Automatic transcription of drum sequences using audiovisual features
Author :
Gillet, Olivier ; Richard, Gaël
Author_Institution :
Signal & Image Processing Dept., Telecom Paris, France
Abstract :
The transcription of a musical performance from the audio signal is often problematic, either because it requires the separation of complex sources, or simply because some important high-level music information cannot be directly extracted from the audio signal. We propose a novel multimodal approach for the transcription of drum sequences using audiovisual features. The transcription is performed by support vector machine (SVM) classifiers, and three different information fusion strategies are evaluated. A correct recognition rate of 85.8% can be achieved for a detailed taxonomy and a fully automated transcription.
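The abstract does not name the three fusion strategies that were evaluated. As an illustration only, the sketch below contrasts two commonly used options: feature-level (early) fusion, where audio and video feature vectors are concatenated before a single classifier, and decision-level (late) fusion, where per-class scores from separate audio and video classifiers are combined. The function names, the weighted-sum rule, the class labels, and the score values are all assumptions made for this example and are not taken from the paper.

```python
from typing import Dict, List


def early_fusion(audio_features: List[float], video_features: List[float]) -> List[float]:
    """Feature-level fusion: concatenate both vectors into one classifier input."""
    return list(audio_features) + list(video_features)


def late_fusion(
    audio_scores: Dict[str, float],
    video_scores: Dict[str, float],
    audio_weight: float = 0.5,
) -> str:
    """Decision-level fusion: weighted sum of per-class scores, then pick the best class.

    The weighted-sum rule is a hypothetical choice for illustration.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    classes = set(audio_scores) & set(video_scores)
    if not classes:
        raise ValueError("the two classifiers share no class labels")
    fused = {
        label: audio_weight * audio_scores[label]
        + (1.0 - audio_weight) * video_scores[label]
        for label in classes
    }
    # Sort labels first so that ties are broken deterministically.
    return max(sorted(fused), key=lambda label: fused[label])


# Made-up scores for a single drum stroke (not data from the paper).
audio = {"snare": 0.6, "hi-hat": 0.4}
video = {"snare": 0.2, "hi-hat": 0.8}

fused_vector = early_fusion([0.1, 0.2], [0.3])
equal_weight_label = late_fusion(audio, video)
audio_heavy_label = late_fusion(audio, video, audio_weight=0.9)
```

In a full system the scores would come from trained classifiers (SVMs in the paper) rather than hand-written dictionaries; the weight controls how much the audio modality is trusted relative to the video one.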
Keywords :
audio signal processing; audio-visual systems; music; pattern classification; sensor fusion; sequences; signal classification; support vector machines; video signal processing; SVM classifiers; audio signal; audiovisual features; automatic music transcription; complex source separation; drum sequences; high-level music information; information fusion strategies; multimodal approach; support vector machine classifiers; video signal; Audio recording; Data mining; Independent component analysis; Instruments; Layout; Machine assisted indexing; Multiple signal classification; Signal processing; Support vector machine classification; Support vector machines;
Conference_Title :
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05), 2005
Print_ISBN :
0-7803-8874-7
DOI :
10.1109/ICASSP.2005.1415682