DocumentCode
3485725
Title
Cross-lingual portability of Chinese and English neural network features for French and German LVCSR
Author
Plahl, Christian ; Schlüter, Ralf ; Ney, Hermann
Author_Institution
Comput. Sci. Dept., RWTH Aachen Univ., Aachen, Germany
fYear
2011
fDate
11-15 Dec. 2011
Firstpage
371
Lastpage
376
Abstract
This paper investigates neural network (NN) based cross-lingual probabilistic features. Earlier work reports that intra-lingual features consistently outperform the corresponding cross-lingual features. We show that this may not generalize. Depending on the complexity of the NN features, cross-lingual features reduce the resources used for training (the NN has to be trained on one language only) without any loss in performance w.r.t. word error rate (WER). To further investigate this inconsistency between intra- and cross-lingual neural network features, we analyze the performance of these features w.r.t. the degree of kinship between training and testing language and the amount of training data used. Whenever the same amount of data is used for NN training, a close relationship between training and testing language is required to achieve similar results. Increasing the amount of training data, or changing the topology of the NN to the bottleneck structure, makes this relationship less important. Moreover, cross-lingual features trained on English or Chinese improve the best intra-lingual system by up to 2% relative in WER for German and by up to 3% relative for French, and achieve the same improvement as discriminative training. Combining intra- and cross-lingual systems yields a further gain of up to 8% relative in WER.
Keywords
learning (artificial intelligence); natural language processing; neural nets; probability; speech recognition; vocabulary; Chinese neural network features; English neural network features; French LVCSR; German LVCSR; NN training; WER; bottleneck structure; cross-lingual neural network features; cross-lingual portability; cross-lingual probabilistic features; discriminative training; intra-lingual features; intra-lingual neural network features; intra-lingual system; testing language; topology; training data; training language; w.r.t; word error rate; Artificial neural networks; Feature extraction; Hidden Markov models; Neck; Probabilistic logic; Testing; Training; LVCSR; cross-lingual portability; feature extraction; neural network;
fLanguage
English
Publisher
IEEE
Conference_Titel
Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop on
Conference_Location
Waikoloa, HI
Print_ISBN
978-1-4673-0365-1
Electronic_ISBN
978-1-4673-0366-8
Type
conf
DOI
10.1109/ASRU.2011.6163960
Filename
6163960