Title :
Weak top-down constraints for unsupervised acoustic model training
Author :
Jansen, Anton ; Thomas, Stephan ; Hermansky, Hynek
Author_Institution :
Human Language Technology Center of Excellence, Johns Hopkins University, Baltimore, MD, USA
Abstract :
Typical supervised acoustic model training relies on strong top-down constraints provided by dynamic programming alignment of the input observations to phonetic sequences derived from orthographic word transcripts and pronunciation dictionaries. This paper investigates a much weaker form of top-down supervision for use in place of transcripts and dictionaries in the zero resource setting. Our proposed constraints, which can be produced using recent spoken term discovery systems, come in the form of pairs of isolated word examples that share the same unknown type. For each pair, we perform a dynamic programming alignment of the acoustic observations of the two constituent examples, generating an inventory of cross-speaker frame pairs that each provide evidence that the same subword unit model should account for them. We find these weak top-down constraints are capable of improving model speaker independence by up to 57% relative over bottom-up training alone.
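
The pairwise alignment step described in the abstract can be illustrated with a minimal sketch. It assumes standard dynamic time warping (DTW) with a cosine frame distance over generic feature vectors; the abstract does not specify the features, the distance, or the path constraints, so the names `cosine_distance` and `dtw_frame_pairs` and all of those choices are hypothetical rather than the authors' implementation.

```python
import math


def cosine_distance(u, v):
    """Cosine distance between two feature vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # treat an all-zero frame as maximally distant
    return 1.0 - dot / (norm_u * norm_v)


def dtw_frame_pairs(x, y, dist=cosine_distance):
    """Align two word examples and return the aligned frame index pairs.

    x, y: sequences of feature vectors, one vector per acoustic frame.
    Returns a list of (i, j) pairs along the minimum-cost warping path,
    meaning frame i of x is aligned with frame j of y.
    """
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] = cheapest cumulative cost of aligning x[:i] with y[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(x[i - 1], y[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1],  # advance both
                                 cost[i - 1][j],      # advance x only
                                 cost[i][j - 1])      # advance y only

    # Backtrack from the final cell to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return path


# Toy example: two three-frame "words" with two-dimensional features.
word_a = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
word_b = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
frame_pairs = dtw_frame_pairs(word_a, word_b)
```

Each returned pair would then count as one piece of evidence that the two frames should be explained by the same subword unit model; how that evidence enters training is not covered by this sketch.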
Keywords :
acoustic signal processing; dictionaries; dynamic programming; pattern clustering; speaker recognition; unsupervised learning; acoustic observations; cross-speaker frame pairs; dynamic programming alignment; isolated word examples; model speaker independence; spoken term discovery systems; subword unit model; transcripts; unsupervised acoustic model training; unsupervised clustering; weak top-down constraints; zero resource setting; Acoustics; Computational modeling; Hidden Markov models; Speech; Training; Vectors; speaker independent acoustic models; spectral clustering; top-down constraints; unsupervised training
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
DOI :
10.1109/ICASSP.2013.6639241