On-line continuous-time music mood regression with deep recurrent neural networks

Author

Weninger, Felix ; Eyben, Florian ; Schuller, Björn

Author_Institution

Machine Intell. & Signal Process. Group, Tech. Univ. München, München, Germany

fYear

2014

fDate

4-9 May 2014

Firstpage

5412

Lastpage

5416

Abstract

This paper proposes a novel machine learning approach for the task of on-line continuous-time music mood regression, i.e., low-latency prediction of the time-varying arousal and valence in musical pieces. On the front-end, a large set of segmental acoustic features is extracted to model short-term variations. Then, multi-variate regression is performed by deep recurrent neural networks to model longer-range context and capture the time-varying emotional profile of musical pieces appropriately. Evaluation is done on the 2013 MediaEval Challenge corpus, which consists of 1000 pieces annotated by crowd-sourcing in continuous time with continuous arousal and valence values. In the results, recurrent neural networks outperform support vector regression (SVR) and feedforward neural networks in both continuous-time and static music mood regression, and achieve an R² of up to .70 for arousal and .50 for valence.
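
The abstract reports results as R² values for arousal and valence. As a rough illustration only, the sketch below computes R² per dimension, assuming the standard coefficient of determination (1 - SS_res / SS_tot); the record does not state which definition the paper uses, and the function names and the toy data are hypothetical, not taken from the paper.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: squared error of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared deviation of the targets from their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined for constant targets")
    return 1.0 - ss_res / ss_tot


def r_squared_per_dimension(targets, predictions):
    """Score (arousal, valence) pairs separately, one R^2 per dimension."""
    arousal = r_squared([t[0] for t in targets], [p[0] for p in predictions])
    valence = r_squared([t[1] for t in targets], [p[1] for p in predictions])
    return arousal, valence


# Hypothetical toy annotations and predictions, one (arousal, valence) pair per time step.
toy_targets = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0), (3.0, 4.0)]
toy_predictions = [(0.0, 2.5), (1.0, 2.5), (2.0, 2.5), (2.0, 2.5)]
toy_scores = r_squared_per_dimension(toy_targets, toy_predictions)
```

Under this definition, a predictor that always outputs the mean of the targets scores 0 and a perfect predictor scores 1, which gives a scale for reading the reported .70 and .50.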

Keywords

emotion recognition; feature extraction; learning (artificial intelligence); music; recurrent neural nets; regression analysis; MediaEval challenge corpus; SVR; crowd-sourcing; deep recurrent neural network; feedforward neural networks; longer-range context; low-latency prediction; machine learning approach; multivariate regression; musical pieces valence; musical time-varying arousal; online continuous-time music mood regression; segmental acoustic features; short-term variations model; static music mood regression; support vector regression; time-varying emotional profile; Acoustics; Emotion recognition; Maximum likelihood estimation; Mood; Recurrent neural networks; Training; emotion recognition; music information retrieval; recurrent neural networks;

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on

Conference_Location

Florence

Type

conf

DOI

10.1109/ICASSP.2014.6854637

Filename

6854637