Title :
An attribute-based approach to audio description applied to segmenting vocal sections in popular music songs
Author :
Sundaram, Shiva ; Narayanan, Shrikanth
Author_Institution :
Dept. of Electr. Eng.-Syst., Univ. of Southern California, Los Angeles, CA
Abstract :
We present a descriptive approach for analyzing audio scenes that can comprise a mixture of audio sources. We apply this method to segment popular music songs into vocal and non-vocal sections. Unlike existing methods that rely directly on within-class feature similarities of acoustic sources, the proposed data-driven system is based on a training set in which the acoustic sources are grouped by their perceptual or semantic attributes. Our audio analysis approach is based on a quantitative, time-varying metric, developed using pattern recognition methods, that measures the interaction between the acoustic sources present in a scene. Using the proposed system, trained on a general sound effects library, we achieve less than ten percent vocal-section segmentation error and less than five percent false alarm rate when evaluated on a database of popular music recordings that spans four genres (rock, hip-hop, pop, and easy listening).
Keywords :
audio databases; audio signal processing; musical acoustics; pattern recognition; time-varying systems; audio sources; data-driven system; music recordings database; music songs; pattern recognition methods; time-varying metric; vocal sections segmentation; Acoustic measurements; Audio recording; Databases; Laboratories; Layout; Libraries; Music information retrieval; Pattern analysis; Pattern recognition; Speech analysis
Conference_Title :
Multimedia Signal Processing, 2006 IEEE 8th Workshop on
Conference_Location :
Victoria, BC
Print_ISBN :
0-7803-9751-7
Electronic_ISBN :
0-7803-9752-5
DOI :
10.1109/MMSP.2006.285277