Lattice-based unsupervised acoustic model training

Author

Fraga-Silva, Thiago ; Gauvain, Jean-Luc ; Lamel, Lori

Author_Institution

Spoken Language Process. Group, LIMSI-CNRS, Orsay, France

fYear

2011

fDate

22-27 May 2011

Firstpage

4656

Lastpage

4659

Abstract

Unsupervised acoustic model training has been successfully used to improve the performance of automatic speech recognition systems when only a small amount of manually transcribed data is available for the target domain. The most common approach is to use automatic transcriptions to guide acoustic model estimation. However, since even the best recognition hypotheses are known to contain errors, we propose to consider multiple transcription hypotheses during training. The idea is that the EM process can benefit from the estimated posterior probabilities of the hypotheses to converge to a better solution. The proposed unsupervised training method is based on lattices. Lattice-based training gives a relative improvement of 2.2% over 1-best training on a Broadcast News transcription task and converges faster with iterative incremental training.

Keywords

speech recognition; EM process; automatic speech recognition system; broadcast news transcription task; iterative incremental training; lattice-based unsupervised acoustic model training; target domain; Acoustics; Data models; Hidden Markov models; Lattices; Speech recognition; Training; Training data; Acoustic Modeling; Lattice-based training; Speech recognition; Unsupervised training;

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on

Conference_Location

Prague

ISSN

1520-6149

Print_ISBN

978-1-4577-0538-0

Electronic_ISBN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2011.5947393

Filename

5947393