Cross-lingual portability of Chinese and English neural network features for French and German LVCSR

Author

Plahl, Christian ; Schlüter, Ralf ; Ney, Hermann

Author_Institution

Comput. Sci. Dept., RWTH Aachen Univ., Aachen, Germany

fYear

2011

fDate

11-15 Dec. 2011

Firstpage

371

Lastpage

376

Abstract

This paper investigates neural network (NN) based cross-lingual probabilistic features. Earlier work reports that intra-lingual features consistently outperform the corresponding cross-lingual features. We show that this may not generalize. Depending on the complexity of the NN features, cross-lingual features reduce the resources used for training (the NN has to be trained on one language only) without any loss in performance w.r.t. word error rate (WER). To further investigate this inconsistency concerning intra- vs. cross-lingual neural network features, we analyze the performance of these features w.r.t. the degree of kinship between training and testing language and the amount of training data used. Whenever the same amount of data is used for NN training, a close relationship between training and testing language is required to achieve similar results. This relationship becomes less important when the amount of training data is increased or when the topology of the NN is changed to the bottleneck structure. Moreover, cross-lingual features trained on English or Chinese improve the best intra-lingual system by up to 2% relative in WER for German and by up to 3% relative for French, achieving the same improvement as discriminative training. In addition, we gain up to a further 8% relative in WER by combining intra- and cross-lingual systems.

Keywords

learning (artificial intelligence); natural language processing; neural nets; probability; speech recognition; vocabulary; Chinese neural network features; English neural network features; French LVCSR; German LVCSR; NN training; WER; bottleneck structure; cross-lingual neural network features; cross-lingual portability; cross-lingual probabilistic features; discriminative training; intra-lingual features; intra-lingual neural network features; intra-lingual system; testing language; topology; training data; training language; w.r.t; word error rate; Artificial neural networks; Feature extraction; Hidden Markov models; Neck; Probabilistic logic; Testing; Training; LVCSR; cross-lingual portability; feature extraction; neural network

fLanguage

English

Publisher

IEEE

Conference_Titel

Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop on

Conference_Location

Waikoloa, HI

Print_ISBN

978-1-4673-0365-1

Electronic_ISBN

978-1-4673-0366-8

Type

conf

DOI

10.1109/ASRU.2011.6163960

Filename

6163960