DocumentCode
3246264
Title
Experiments in word-reordering and morphological preprocessing for transducer-based statistical machine translation
Author
De Gispert, Adrià ; Mariño, José B.
Author_Institution
TALP Res. Center, Univ. Politecnica de Catalunya, Barcelona, Spain
fYear
2003
fDate
30 Nov.-3 Dec. 2003
Firstpage
634
Lastpage
639
Abstract
Statistical speech translation can be achieved by an integrated search procedure that produces speech recognition and translation at the same time. Based on finite-state transducers and Viterbi search, the approach forces the rewriting of the target language into a set of modified units (each containing zero, one, or more original words) to preserve the monotonicity of the search over the speech signal without changing the word order in the target language. In this paper, we analyse the effect of non-monotonic word alignments in two different Spanish-to-English translation tasks (the speech-aimed small-vocabulary Verbmobil task and the text-aimed large-vocabulary European Parliament task), revealing the most frequent cross patterns and experimenting with reordering strategies to improve the transducer probabilities. In addition, some preliminary results are presented on introducing POS-tagging and lemmatization, as well as some preprocessing such as categorization, to help improve the training of the system.
Keywords
language translation; speech recognition; POS-tagging; Spanish/English translation; Viterbi search; categorization; cross patterns; finite-state transducers; integrated search procedure; integrated speech recognition/translation; lemmatization; morphological preprocessing; nonmonotonic word alignments; search monotonicity; speech-aimed small-vocabulary Verbmobil task; text-aimed large-vocabulary European Parliament task; transducer-based statistical machine translation; word-reordering; Acoustic transducers; Equations; Natural languages; Pattern analysis; Speech analysis; Speech processing; Speech recognition; Text recognition; Viterbi algorithm; Vocabulary
fLanguage
English
Publisher
IEEE
Conference_Titel
Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
Print_ISBN
0-7803-7980-2
Type
conf
DOI
10.1109/ASRU.2003.1318514
Filename
1318514