Title :
Learning emotion-based acoustic features with deep belief networks
Author :
Schmidt, Erik M. ; Kim, Youngmoo E.
Author_Institution :
Music & Entertainment Technol. Lab. (MET-Lab.), Drexel Univ., Philadelphia, PA, USA
Abstract :
The medium of music has evolved specifically for the expression of emotions, and it is natural for us to organize music in terms of its emotional associations. But while such organization is a natural process for humans, quantifying it empirically proves to be a very difficult task, and as such no dominant feature representation for music emotion recognition has yet emerged. Much of the difficulty in developing emotion-based features stems from the ambiguity of the ground truth. Even with the smallest time window, opinions on the emotion are bound to vary and reflect some disagreement between listeners. In previous work, we have modeled human response labels to music in the arousal-valence (A-V) representation of affect as a time-varying, stochastic distribution. Current methods for automatic detection of emotion in music seek performance increases by combining several feature domains (e.g., loudness, timbre, harmony, rhythm). Such work has focused largely on dimensionality reduction for minor gains in classification performance, but has provided little insight into the relationship between audio and emotional associations. In this new work, we seek to employ regression-based deep belief networks to learn features directly from magnitude spectra. While the system is applied to the specific problem of music emotion recognition, it could easily be applied to any regression-based audio feature learning problem.
Keywords :
audio signal processing; belief networks; emotion recognition; music; acoustic features; audio feature learning problem; expression of emotions; learning emotion; magnitude spectra; music emotion recognition; regression based deep belief networks; Emotion recognition; Humans; Machine learning; Music; Topology; Training; deep belief networks; feature learning; regression
Conference_Title :
Applications of Signal Processing to Audio and Acoustics (WASPAA), 2011 IEEE Workshop on
Conference_Location :
New Paltz, NY
Print_ISBN :
978-1-4577-0692-9
Electronic_ISBN :
1931-1168
DOI :
10.1109/ASPAA.2011.6082328