Title :
Musical Source Clustering and Identification in Polyphonic Audio
Author :
Arora, Vipul ; Behera, Laxmidhar
Author_Institution :
Dept. of Electr. Eng., Indian Inst. of Technol., Kanpur, Kanpur, India
Abstract :
For music transcription or musical source separation, apart from knowing the multi-F0 contours, it is also important to know which F0 has been played by which instrument. This paper focuses on this aspect: given the polyphonic audio along with its multiple F0 contours, the proposed system clusters the contours so as to decide "which instrument played when." Many supervised methods are available for identifying the instruments or singers in polyphonic audio, but individual source audio is often not available for training. To address this problem, this paper proposes novel schemes using semi-supervised as well as unsupervised approaches to source clustering. The proposed theoretical framework is based on auditory perception theory and is implemented using tools such as probabilistic latent component analysis and graph clustering, while taking into account various perceptual cues for characterizing a source. Experiments have been carried out over a wide variety of datasets, ranging from vocal to instrumental and from synthetic to real-world music. The proposed scheme significantly outperforms a state-of-the-art unsupervised scheme that does not make use of the given F0 contours. The proposed semi-supervised approach also performs better than another semi-supervised scheme that makes use of the given F0 information, in terms of both computation and accuracy.
Keywords :
acoustic signal processing; audio signal processing; graph theory; information retrieval; music; musical instruments; pattern clustering; probability; statistical analysis; unsupervised learning; acoustic scene analysis; auditory perception theory; graph clustering; multiF0 contours; music information retrieval; music transcription; musical source clustering; musical source identification; musical source separation; polyphonic audio; polyphonic instrument identification; probabilistic latent component analysis; semi-supervised approach; supervised methods; unsupervised approach; Feature extraction; Harmonic analysis; IEEE transactions; Indexes; Instruments; Speech; Speech processing
Journal_Title :
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
DOI :
10.1109/TASLP.2014.2313404