DocumentCode :
2175007
Title :
Improved models for Mandarin speech-to-text transcription
Author :
Lamel, Lori ; Gauvain, Jean-Luc ; Le, Viet Bac ; Oparin, Ilya ; Meng, Sha
Author_Institution :
Spoken Language Processing Group, LIMSI-CNRS, Orsay, France
fYear :
2011
fDate :
22-27 May 2011
Firstpage :
4660
Lastpage :
4663
Abstract :
This paper describes recent advances at LIMSI in Mandarin Chinese speech-to-text transcription. A number of novel approaches were introduced in the different system components. The acoustic models are trained on over 1600 hours of audio data from a range of sources, and include pitch and MLP features. N-gram and neural network language models are trained on very large corpora, over 3 billion words of text, and LM adaptation was explored at different levels: per show, per snippet, or per speaker cluster. Character-based consensus decoding was found to outperform word-based consensus decoding for Mandarin. The improved system reduces the character error rate (CER) by about 10% relative on previous GALE development and evaluation data sets, obtaining a CER of 9.2% on the P4 broadcast news and broadcast conversation evaluation data.
Keywords :
speech processing; CER; LIMSI; MLP features; Mandarin speech-to-text transcription; N-gram language models; acoustic models; character error rate; character-based consensus decoding; neural network language models; Adaptation models; Artificial neural networks; Decoding; Interpolation; Speech; Speech recognition; Training; Mandarin; character error rate; speech recognition; speech-to-text transcription;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location :
Prague
ISSN :
1520-6149
Print_ISBN :
978-1-4577-0538-0
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2011.5947394
Filename :
5947394