DocumentCode :
1414246
Title :
Towards Timbre-Invariant Audio Features for Harmony-Based Music
Author :
Müller, Meinard ; Ewert, Sebastian
Author_Institution :
Max-Planck Inst. für Inf., Saarland Univ., Saarbrücken, Germany
Volume :
18
Issue :
3
fYear :
2010
fDate :
3/1/2010 12:00:00 AM
Firstpage :
649
Lastpage :
662
Abstract :
Chroma-based audio features are a well-established tool for analyzing and comparing harmony-based Western music that is based on the equal-tempered scale. By identifying spectral components that differ by a musical octave, chroma features possess a considerable amount of robustness to changes in timbre and instrumentation. In this paper, we describe a novel procedure that further enhances chroma features by significantly boosting the degree of timbre invariance without degrading the features' discriminative power. Our idea is based on the generally accepted observation that the lower mel-frequency cepstral coefficients (MFCCs) are closely related to timbre. Now, instead of keeping the lower coefficients, we discard them and only keep the upper coefficients. Furthermore, using a pitch scale instead of a mel scale allows us to project the remaining coefficients onto the 12 chroma bins. We present a series of experiments to demonstrate that the resulting chroma features outperform various state-of-the-art features in the context of music matching and retrieval applications. As a final contribution, we give a detailed analysis of our enhancement procedure revealing the musical meaning of certain pitch-frequency cepstral coefficients.
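The procedure outlined in the abstract can be illustrated with a minimal sketch for a single analysis frame. It is not the authors' implementation. The function name `timbre_reduced_chroma`, the number of pitch bands (inferred from the input length), the cutoff `first_kept`, the log-compression constant `compression`, the inverse transform back to the pitch scale, and the final L2 normalization are all assumptions made for illustration. The input is assumed to be a list of non-negative energies indexed by MIDI pitch number, so that index modulo 12 gives the chroma bin.

```python
import math


def dct_matrix(n):
    """Return the orthonormal DCT-II matrix (n x n) as a list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (i + 0.5) * k / n)
                     for i in range(n)])
    return rows


def timbre_reduced_chroma(pitch_energy, first_kept=55, compression=1000.0):
    """Sketch: pitch energies -> log -> DCT -> drop low coefficients -> chroma.

    first_kept and compression are hypothetical defaults, not values
    confirmed by the record above.
    """
    n = len(pitch_energy)
    d = dct_matrix(n)

    # Logarithmic compression of the pitch-band energies.
    log_energy = [math.log(1.0 + compression * e) for e in pitch_energy]

    # Pitch-frequency cepstral coefficients: DCT along the pitch axis.
    coeffs = [sum(d[k][i] * log_energy[i] for i in range(n)) for k in range(n)]

    # Discard the lower coefficients (assumed to carry most timbre information).
    for k in range(min(first_kept, n)):
        coeffs[k] = 0.0

    # Inverse DCT (transpose of the orthonormal matrix) back to the pitch scale.
    pitch = [sum(d[k][i] * coeffs[k] for k in range(n)) for i in range(n)]

    # Project onto the 12 chroma bins by summing pitches one octave apart.
    chroma = [0.0] * 12
    for p, value in enumerate(pitch):
        chroma[p % 12] += value

    # L2-normalize; return zeros when nothing but rounding noise remains.
    norm = math.sqrt(sum(c * c for c in chroma))
    if norm < 1e-9:
        return [0.0] * 12
    return [c / norm for c in chroma]


if __name__ == "__main__":
    frame = [0.0] * 120
    for midi_pitch in (60, 64, 67):  # C4, E4, G4
        frame[midi_pitch] = 1.0
    print(timbre_reduced_chroma(frame))
```

The matrix-based DCT keeps the sketch free of third-party dependencies. A practical implementation would more likely use a fast transform from a numerical library and process all frames at once.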
Keywords :
audio signal processing; cepstral analysis; music; audio matching; equal-tempered scale; harmony-based music; mel-frequency cepstral coefficients; pitch feature; timbre-invariant audio features; Art; Boosting; Cepstral analysis; Degradation; Discrete cosine transforms; Instruments; Mel frequency cepstral coefficient; Music information retrieval; Robustness; Timbre; Audio matching; chroma feature; mel-frequency cepstral coefficient (MFCC); music retrieval; pitch feature; timbre-invariance;
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2010.2041394
Filename :
5410051