Title of article :
Unsupervised training of acoustic models for large vocabulary continuous speech recognition
Author/Authors :
H. Ney (author), F. Wessel (author)
Issue Information :
Journal with serial number, year 2004
Abstract :
For large vocabulary continuous speech recognition systems, the amount of acoustic training data is of crucial importance. In the past, large amounts of speech were therefore recorded from various sources and had to be transcribed manually. It is thus desirable to train a recognizer with as little manually transcribed acoustic data as possible. Since untranscribed speech is available in various forms nowadays, the unsupervised training of a speech recognizer on recognized transcriptions is studied in this paper. A low-cost recognizer trained with between one and six hours of manually transcribed speech is used to recognize 72 hours of untranscribed acoustic data. These transcriptions are then used in combination with a confidence measure to train an improved recognizer. The effect of the confidence measure, which is used to detect possible recognition errors, is studied systematically. Finally, the unsupervised training is applied iteratively. Starting with only one hour of transcribed acoustic data, a recognition system is trained fully automatically. With this iterative training procedure, the word error rates are reduced from 71.3% to 38.3% on the Broadcast News'96 evaluation test set and from 65.6% to 29.3% on the Broadcast News'98 evaluation test set. In comparison with an optimized system trained with the manually generated transcriptions of the complete 72-hour training corpus, the word error rates increase by 14.3% relative and 18.6% relative, respectively.
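The abstract mixes absolute and relative word error rate (WER) figures, so a short worked calculation may help when reading them. The Python sketch below uses only the numbers quoted in the abstract. The helper names `relative_change` and `implied_baseline` are hypothetical, and the baseline figure rests on an assumption: that "increase by X% relative" is measured against the WER of the fully supervised system. The derived baseline is therefore an estimate, not a value reported in this record.

```python
def relative_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent (negative means a reduction)."""
    return (new - old) / old * 100.0


def implied_baseline(wer: float, relative_increase_pct: float) -> float:
    """WER of a reference system, ASSUMING `wer` is that reference WER
    increased by `relative_increase_pct` percent (an interpretation of the abstract)."""
    return wer / (1.0 + relative_increase_pct / 100.0)


# Figures quoted in the abstract (WER in percent).
results = {
    "Broadcast News'96": {"start": 71.3, "final": 38.3, "rel_increase": 14.3},
    "Broadcast News'98": {"start": 65.6, "final": 29.3, "rel_increase": 18.6},
}

for name, r in results.items():
    # Relative WER reduction achieved by the iterative unsupervised training.
    reduction = -relative_change(r["start"], r["final"])
    # Estimated WER of the supervised system under the stated assumption.
    baseline = implied_baseline(r["final"], r["rel_increase"])
    print(f"{name}: {reduction:.1f}% relative WER reduction, "
          f"implied supervised baseline about {baseline:.1f}% WER")
```

Under these assumptions, the iterative procedure roughly halves the WER on both test sets, and the implied supervised baselines lie a few absolute points below the unsupervised results.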
Journal title :
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING