DocumentCode
763444
Title
Modeling timbre distance with temporal statistics from polyphonic music
Author
Mörchen, Fabian ; Ultsch, Alfred ; Thies, Michael ; Löhken, Ingo
Author_Institution
Data Bionics Res. Group, Philipps Univ., Marburg, Germany
Volume
14
Issue
1
fYear
2006
Firstpage
81
Lastpage
90
Abstract
Timbre distance and similarity are expressions of the phenomenon that some music appears similar while other songs sound very different to us. The notion of genre is often used to categorize music, but songs from a single genre do not necessarily sound similar and vice versa. In this work, we analyze and compare a large number of different audio features and psychoacoustic variants thereof for the purpose of modeling timbre distance. The sound of polyphonic music is commonly described by extracting audio features on short time windows during which the sound is assumed to be stationary. The resulting downsampled time series are aggregated to form a high-level feature vector describing the music. We generated high-level features by systematically applying static and temporal statistics for aggregation. The temporal structure of features in particular has previously been largely neglected. A novel supervised feature selection method is applied to the huge set of possible features. The distances of the selected features correspond to timbre differences in music. The features show few redundancies and have high potential for explaining possible clusters. They outperform seven other previously proposed feature sets on several datasets with respect to the separation of the known groups of timbrally different music.
Keywords
acoustic signal processing; audio signal processing; feature extraction; music; statistical analysis; audio feature extraction; high-level feature vector; polyphonic music; psychoacoustic; static statistics; supervised feature selection method; temporal statistics; timbre distance; Data mining; Data visualization; Feature extraction; Helium; Multiple signal classification; Music; Psychoacoustic models; Psychology; Statistics; Timbre; Feature generation; feature maps; feature selection; music; self-organizing; time series; visualization
fLanguage
English
Journal_Title
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1558-7916
Type
jour
DOI
10.1109/TSA.2005.860352
Filename
1561266