DocumentCode :
3716200
Title :
Speaker diarization through speaker embeddings
Author :
Mickael Rouvier;Pierre-Michel Bousquet;Benoit Favre
Author_Institution :
Aix-Marseille Université
fYear :
2015
Firstpage :
2082
Lastpage :
2086
Abstract :
This paper proposes learning a set of high-level feature representations through deep learning, referred to as speaker embeddings, for speaker diarization. Speaker embedding features are taken from the hidden-layer neuron activations of Deep Neural Networks (DNNs) trained as classifiers to recognize a thousand speaker identities in a training set. Although learned through identification, speaker embeddings are shown to be effective for speaker verification, in particular for recognizing speakers unseen in the training set. This approach is then applied to speaker diarization. Experiments conducted on ETAPE, a corpus of French broadcast news, show that this new speaker modeling technique decreases the diarization error rate (DER) by 1.67 points (a relative improvement of about 8%).
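The idea of reading embeddings off a hidden layer can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the single tanh hidden layer, the layer sizes, the random (untrained) weights, the mean pooling over frames, and the cosine comparison are all hypothetical choices. It only shows the mechanism: a network with a speaker-classification output layer, from which the hidden activations, not the class posteriors, are kept as the representation.

```python
import numpy as np


def init_mlp(n_in, n_hidden, n_speakers, seed=0):
    """Create a toy one-hidden-layer classifier (weights are random, NOT trained)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.standard_normal((n_in, n_hidden)) * 0.1,
        "b1": np.zeros(n_hidden),
        "W2": rng.standard_normal((n_hidden, n_speakers)) * 0.1,
        "b2": np.zeros(n_speakers),
    }


def hidden_activations(params, frames):
    """Hidden-layer activations for each input frame, shape (n_frames, n_hidden)."""
    return np.tanh(frames @ params["W1"] + params["b1"])


def speaker_posteriors(params, frames):
    """Softmax over training-speaker identities; used only while training."""
    logits = hidden_activations(params, frames) @ params["W2"] + params["b2"]
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


def segment_embedding(params, frames):
    """Assumed pooling: average the hidden activations over a segment's frames."""
    return hidden_activations(params, frames).mean(axis=0)


def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    params = init_mlp(n_in=20, n_hidden=16, n_speakers=1000)
    seg_a = rng.standard_normal((50, 20))  # 50 frames of 20-dim features
    seg_b = rng.standard_normal((80, 20))
    emb_a = segment_embedding(params, seg_a)
    emb_b = segment_embedding(params, seg_b)
    print(emb_a.shape, cosine(emb_a, emb_b))
```

Because the embedding comes from the hidden layer rather than the output layer, it can be computed for speakers who were never among the training classes, which is the property the abstract relies on.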
Keywords :
"Training","Density estimation robust algorithm","Speech","Neurons","Feature extraction","Europe","Signal processing"
Publisher :
IEEE
Conference_Title :
Signal Processing Conference (EUSIPCO), 2015 23rd European
Electronic_ISBN :
2076-1465
Type :
conf
DOI :
10.1109/EUSIPCO.2015.7362751
Filename :
7362751