Title :
Automatic selection of speakers for improved acoustic modelling: recognition of disordered speech with sparse data
Author :
Christensen, H. ; Casanueva, I. ; Cunningham, S. ; Green, P. ; Hain, T.
Author_Institution :
Dept. of Comput. Sci., Univ. of Sheffield, Sheffield, UK
Abstract :
The automatic recognition of disordered speech is a domain characterised by limited amounts of training data for each speaker and large intra- and inter-speaker variation. This paper is concerned with how best to train acoustic models in these circumstances; in particular, we look at how to select data for a background model from a pool of speakers for a given target speaker. We show that rather than including data from all available speakers (the standard approach in typical speech domains), significantly better accuracy can be achieved by carefully selecting which speakers should contribute. Different methods based on measuring acoustic closeness between speakers and ranking them accordingly are investigated, and on the UASpeech isolated word recognition task, we achieve an 11.5% relative improvement compared to the baseline which uses data from all speakers. Accuracies for speakers with moderate to severe impairments are shown to improve the most, with one speaker classed as having 'low' intelligibility gaining a 60% relative improvement in accuracy.
Keywords :
acoustic signal processing; handicapped aids; speaker recognition; acoustic modelling; automatic speaker selection; disordered speech recognition; dysarthric speech; physical disability; sparse data; Accuracy; Adaptation models; Data models; Speech; Speech recognition; Training; recognition of dysarthric speech; sparse data; speaker selection;
Conference_Titel :
Spoken Language Technology Workshop (SLT), 2014 IEEE
DOI :
10.1109/SLT.2014.7078583