Title :
Cross-language bootstrapping based on completely unsupervised training using multilingual A-stabil
Author :
Vu, Ngoc Thang ; Kraus, Franziska ; Schultz, Tanja
Abstract :
This paper presents our work on rapid language adaptation of acoustic models based on multilingual cross-language bootstrapping and unsupervised training. We used Automatic Speech Recognition (ASR) systems in English, French, German, and Spanish to build a Czech ASR system from scratch. The system was built without any transcribed audio data in three consecutive steps: cross-language transfer, unsupervised training based on the "multilingual A-stabil" confidence score, and bootstrapping. Based on the confidence score, we selected 72% (16.6 hours) of the available audio data, with a transcription word error rate (WER) of less than 14.5%. The cross-language bootstrap achieves a WER of 23.3% on the Czech development set and 22.4% on the evaluation set. These results are very promising, as the performance compares favorably to the Czech ASR system trained on 23 hours of manually transcribed data (21.8% on the development set and 21.3% on the evaluation set).
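The confidence-based selection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not define the "multilingual A-stabil" score, so the code below assumes a simple agreement measure: the fraction of alternative hypotheses that contain a given word. The function names, the placeholder tokens, and the threshold are hypothetical and are not taken from the paper.

```python
def agreement_score(word, alternative_hypotheses):
    """Fraction of alternative hypotheses that contain `word`.

    Assumption: this stands in for a confidence score; the paper's
    actual "multilingual A-stabil" definition may differ.
    """
    if not alternative_hypotheses:
        return 0.0
    hits = sum(1 for hyp in alternative_hypotheses if word in hyp)
    return hits / len(alternative_hypotheses)


def select_words(reference_hypothesis, alternative_hypotheses, threshold):
    """Keep the words of the reference hypothesis whose score meets the threshold."""
    return [
        word
        for word in reference_hypothesis
        if agreement_score(word, alternative_hypotheses) >= threshold
    ]


# Placeholder tokens, not real recognizer output.
reference = ["a", "b", "c"]
alternatives = [["a", "b"], ["a", "c"], ["a", "b", "d"]]
selected = select_words(reference, alternatives, threshold=0.6)
```

In this toy input, a word is kept only if at least 60% of the alternative hypotheses agree on it; raising the threshold trades the amount of selected data against the expected transcription quality.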
Keywords :
bootstrapping; speech recognition; Czech ASR system; audio data; automatic speech recognition; multilingual A-stabil; multilingual cross-language bootstrapping; unsupervised training; Acoustics; Adaptation models; Data models; Hidden Markov models; Training; Training data; rapid language adaptation of ASR
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location :
Prague
Print_ISBN :
978-1-4577-0538-0
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2011.5947479