Regional Information Center for Science and Technology (RICEST) - Speaker diarization through speaker embeddings

DocumentCode :

3716200

Title :

Speaker diarization through speaker embeddings

Author :

Mickael Rouvier;Pierre-Michel Bousquet;Benoit Favre

Author_Institution :

Aix-Marseille Université

Year :

2015

Firstpage :

2082

Lastpage :

2086

Abstract :

This paper proposes to learn a set of high-level feature representations through deep learning, referred to as Speaker Embeddings, for speaker diarization. Speaker Embedding features are taken from the hidden-layer neuron activations of Deep Neural Networks (DNN) trained as classifiers to recognize a thousand speaker identities in a training set. Although learned through identification, speaker embeddings are shown to be effective for speaker verification, particularly for recognizing speakers unseen in the training set. Here, the approach is applied to speaker diarization. Experiments conducted on ETAPE, a corpus of French broadcast news, show that this new speaker modeling technique decreases the diarization error rate (DER) by 1.67 points (a relative improvement of about 8%).
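The idea in the abstract, reading a speaker embedding off the hidden layer of a network trained to classify speakers, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the single hidden layer, the tanh activation, the layer sizes, the cosine comparison, and the names SpeakerClassifier, embed and classify are all hypothetical. The weights are random stand-ins for trained parameters, so the scores carry no meaning beyond showing the data flow.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random weights standing in for trained parameters (illustration only)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def affine(layer, x):
    """Compute W x + b for one layer."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def softmax(z):
    """Numerically stable softmax."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


class SpeakerClassifier:
    """Hypothetical one-hidden-layer classifier over training speakers."""

    def __init__(self, n_features, n_hidden, n_speakers, seed=0):
        rng = random.Random(seed)
        self.hidden = make_layer(n_features, n_hidden, rng)
        self.output = make_layer(n_hidden, n_speakers, rng)

    def embed(self, features):
        """Hidden-layer activations, used as the speaker embedding."""
        return [math.tanh(v) for v in affine(self.hidden, features)]

    def classify(self, features):
        """Posterior over training speakers; only needed during training."""
        return softmax(affine(self.output, self.embed(features)))


def cosine(a, b):
    """Cosine similarity between two embeddings (assumed scoring choice)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Two made-up feature vectors standing in for two speech segments.
model = SpeakerClassifier(n_features=20, n_hidden=8, n_speakers=1000)
rng = random.Random(1)
seg_a = [rng.gauss(0, 1) for _ in range(20)]
seg_b = [rng.gauss(0, 1) for _ in range(20)]
score = cosine(model.embed(seg_a), model.embed(seg_b))
```

In this sketch the output layer exists only to define the training task. Once the network is trained, segments from speakers outside the training set would be compared through embed alone, which is what lets the representation be reused for verification and diarization.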

Keywords :

"Training","Density estimation robust algorithm","Speech","Neurons","Feature extraction","Europe","Signal processing"

Publisher :

IEEE

Conference_Title :

Signal Processing Conference (EUSIPCO), 2015 23rd European

Electronic_ISBN :

2076-1465

Type :

conf

DOI :

10.1109/EUSIPCO.2015.7362751

Filename :

7362751

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3716200