Title :
Harmonic Blind Sound Source Isolation Enhanced by Spectrum Clustering
Author :
Zhang, Xin ; Jiang, Wenxin ; Ras, Zbigniew W.
Author_Institution :
Univ. of North Carolina, Pembroke, NC
Abstract :
Automatic indexing of music by instruments and their types is a challenging problem, especially when multiple instruments are playing at the same time. We have built a database containing more than one million music instrument sounds, each described by a large number of features, including standard MPEG7 audio descriptors, features for speech recognition, and many new audio features developed by our team. Our previous research results show that these features lead only to classifiers that successfully identify music instruments in monophonic music (only one instrument playing at a time); their confidence for polyphonic music is much lower. This created the need for blind sound source separation algorithms. In this paper, we present a new spectrum-clustering-enhanced method that improves the estimation of fundamental frequency as well as the balance of the categorization tree of training datasets, and therefore enhances the precision of automatic indexing. The system recursively detects the pitch of the predominant sound source, calculates the features based on the estimated pitch, predicts the most similar spectrum with the corresponding classification tree, and finally subtracts the estimated predominant spectrum, repeating until silence is detected.
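The recursive estimate-and-subtract loop described at the end of the abstract can be illustrated with a minimal sketch. The abstract does not specify the pitch estimator, the feature set, or the classifier, so everything below is an assumption made for illustration: the function names `estimate_f0_bin` and `isolate_sources`, the harmonic-sum pitch estimate, the energy-ratio silence test, and all parameter defaults are hypothetical. In particular, the step where the paper predicts the most similar spectrum with a classification tree is replaced here by a simple harmonic comb copied from the residual, which is not the authors' method.

```python
import numpy as np


def estimate_f0_bin(mag, min_bin, max_bin, n_harmonics=5):
    """Return the candidate bin whose first few harmonics carry the most energy.

    Assumption: a plain harmonic-sum estimator stands in for the paper's
    pitch detector, which the abstract does not describe.
    """
    best_bin, best_score = min_bin, -1.0
    for b in range(min_bin, max_bin + 1):
        idx = [b * h for h in range(1, n_harmonics + 1) if b * h < len(mag)]
        score = mag[idx].sum()
        if score > best_score:
            best_bin, best_score = b, score
    return best_bin


def isolate_sources(mag, min_bin=5, max_bin=60, n_harmonics=5,
                    silence_ratio=0.05, max_sources=4):
    """Repeatedly find the predominant pitch and subtract its spectrum.

    Stops when the residual energy falls below `silence_ratio` of the
    original total (an assumed definition of "silence") or after
    `max_sources` iterations.
    """
    residual = np.asarray(mag, dtype=float).copy()
    total = residual.sum()
    found = []
    while len(found) < max_sources and residual.sum() > silence_ratio * total:
        b = estimate_f0_bin(residual, min_bin, max_bin, n_harmonics)
        # Placeholder for the classifier-predicted spectrum: keep only the
        # residual energy that sits on the harmonics of the estimated pitch.
        estimate = np.zeros_like(residual)
        for h in range(1, n_harmonics + 1):
            k = b * h
            if k < len(residual):
                estimate[k] = residual[k]
        found.append(b)
        residual -= estimate
    return found, residual
```

Calling `isolate_sources` on a single magnitude-spectrum frame returns the list of estimated fundamental-frequency bins in the order they were removed, together with whatever spectrum remains after the last subtraction.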
Keywords :
audio coding; audio databases; blind source separation; database indexing; frequency estimation; learning (artificial intelligence); music; pattern clustering; signal classification; spectral analysis; speech recognition; tree data structures; audio feature; automatic music indexing; blind sound source separation algorithm; dataset training; fundamental frequency estimation; harmonic blind sound source isolation; music instrument; spectrum clustering; speech recognition; standard MPEG7 audio descriptor; tree categorization; Audio databases; Frequency estimation; Instruments; MPEG 7 Standard; Machine assisted indexing; Music; Recursive estimation; Spatial databases; Speech recognition; Standards development; Automatic Indexing; Blind Sound Source Separation; Clustering; Music Information Retrieval; Sound Features;
Conference_Title :
Data Mining Workshops, 2008. ICDMW '08. IEEE International Conference on
Conference_Location :
Pisa
Print_ISBN :
978-0-7695-3503-6
Electronic_ISBN :
978-0-7695-3503-6
DOI :
10.1109/ICDMW.2008.67