Regional Information Center for Science and Technology - Hierarchical neural networks and enhanced class posteriors for social signal classification

DocumentCode :

672380

Title :

Hierarchical neural networks and enhanced class posteriors for social signal classification

Author :

Brueckner, Raymond ; Schuller, Bjorn

Author_Institution :

Machine Intell. & Signal Process. Group, Tech. Univ. Munchen, Munich, Germany

fYear :

2013

fDate :

8-12 Dec. 2013

Firstpage :

362

Lastpage :

367

Abstract :

With the impressive advances of deep learning in recent years, interest in neural networks has resurged in the fields of automatic speech recognition and emotion recognition. In this paper, we apply neural networks to the speaker-independent detection and classification of laughter and filler vocalizations in speech. We first explore modeling class posteriors with standard neural networks and deep stacked autoencoders. We then adopt a hierarchical neural architecture to compute enhanced class posteriors and demonstrate that this approach yields significant and consistent improvements on the Social Signals Sub-Challenge of the Interspeech 2013 Computational Paralinguistics Challenge (ComParE). On the test set of this task, we achieve an unweighted average area under the curve (UAAUC), the official competition measure, of 92.4%. This is an improvement of 9.1% over the baseline and the best result obtained so far on this task.
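
The competition measure named in the abstract, the unweighted average area under the curve, can be illustrated with a minimal Python sketch. It assumes per-frame class posteriors stored as dictionaries and a one-vs-rest AUC for each of the two target classes (laughter and filler), averaged with equal weight. The function names `auc` and `uaauc`, the data layout, and the "other" label for non-target frames are hypothetical and chosen for illustration; this is not the authors' or the challenge organizers' evaluation code.

```python
def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) statistic.

    scores: per-frame posterior for the target class
    labels: 1 if the frame belongs to the target class, else 0
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative frame")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def uaauc(posteriors, reference, classes=("laughter", "filler")):
    """Unweighted mean of the one-vs-rest AUC over the given classes.

    posteriors: list of dicts mapping class name -> posterior (assumed layout)
    reference:  list of reference class names, one per frame
    """
    per_class = []
    for c in classes:
        scores = [frame[c] for frame in posteriors]
        labels = [1 if r == c else 0 for r in reference]
        per_class.append(auc(scores, labels))
    return sum(per_class) / len(per_class)
```

Because each class contributes equally to the mean, a rare class such as laughter carries the same weight as a frequent one. The quadratic pair count is adequate for a small illustration; a sort-based rank computation would be the usual choice for frame-level data at corpus scale.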

Keywords :

behavioural sciences computing; computational linguistics; emotion recognition; learning (artificial intelligence); natural language processing; neural nets; speech recognition; automatic speech recognition; deep learning; deep stacked autoencoders; enhanced class posteriors; filler vocalization classification; hierarchical neural architecture; hierarchical neural networks; Interspeech 2013 computational paralinguistics challenge; laughter classification; social signal classification; social signals subchallenge; speaker-independent detection; Computer architecture; Context; Hidden Markov models; Neural networks; Speech; Training; Vectors; computational paralinguistics challenge; deep autoencoder networks; enhanced posteriors

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on

Conference_Location :

Olomouc

Type :

conf

DOI :

10.1109/ASRU.2013.6707757

Filename :

6707757

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=672380