DocumentCode :
3245501
Title :
Improving the robustness of prosody dependent language modeling based on prosody syntax dependence
Author :
Chen, Ken ; Hasegawa-Johnson, Mark
Author_Institution :
Dept. of Electr. & Comput. Eng., Illinois Univ., Urbana, IL, USA
fYear :
2003
fDate :
30 Nov.-3 Dec. 2003
Firstpage :
435
Lastpage :
440
Abstract :
The paper presents a novel approach that improves the robustness of prosody-dependent language modeling by leveraging the dependence between prosody and syntax. A prosody-dependent language model describes the joint probability distribution of concurrent word and prosody sequences and can be used to provide prior language constraints in a prosody-dependent speech recognizer. Robust maximum likelihood (ML) estimation of prosody-dependent n-gram language models requires a large amount of prosodically transcribed data. We show that prosody-syntax dependence can be used to reduce the data sparseness introduced by prosody-dependent modeling. Experiments on a radio news corpus show that the prosody-dependent language model estimated using our approach reduces the joint perplexity by up to 34% compared with the standard ML-estimated prosody-dependent language model; the word perplexity can be reduced by up to 84% compared with the standard ML-estimated prosody-independent language model. In recognition experiments, the language model estimated by our approach yields an improvement of 1% in word recognition accuracy, 0.7% in accent recognition accuracy, and 1.5% in intonational phrase boundary (IPB) recognition accuracy over a baseline prosody-dependent language model.
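As a rough illustration of the terms used in the abstract, the sketch below estimates a standard ML bigram over joint (word, prosody) tokens and computes its joint perplexity. It is not the paper's proposed method, only the kind of ML baseline the abstract compares against. The prosody tags ("U", "A", "B"), the toy corpus, and the function names are hypothetical and chosen for illustration only.

```python
import math
from collections import Counter

START = ("<s>", "-")  # hypothetical sentence-start token

def train_joint_bigram(sentences):
    """ML bigram estimates over joint (word, prosody) tokens."""
    bigrams, contexts = Counter(), Counter()
    for sent in sentences:
        tokens = [START] + list(sent)
        for prev, cur in zip(tokens, tokens[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    # Relative frequency: count(prev, cur) / count(prev)
    return {bg: c / contexts[bg[0]] for bg, c in bigrams.items()}

def joint_perplexity(model, sentences):
    """Perplexity of the joint word/prosody sequence under the model."""
    log_sum, n = 0.0, 0
    for sent in sentences:
        tokens = [START] + list(sent)
        for prev, cur in zip(tokens, tokens[1:]):
            p = model.get((prev, cur), 0.0)
            if p == 0.0:
                # Unseen joint bigram: unsmoothed ML assigns zero probability
                return math.inf
            log_sum += math.log2(p)
            n += 1
    return 2 ** (-log_sum / n)

# Toy corpus with assumed tags: U = unaccented, A = accented, B = phrase boundary
corpus = [
    [("the", "U"), ("news", "A"), ("today", "B")],
    [("the", "U"), ("news", "A"), ("tonight", "B")],
]
model = train_joint_bigram(corpus)
train_ppl = joint_perplexity(model, corpus)

# Same words, different prosody labels: the joint bigrams are unseen
unseen = [[("the", "A"), ("news", "U"), ("today", "B")]]
unseen_ppl = joint_perplexity(model, unseen)
```

Because every word is split across its possible prosody labels, the joint vocabulary is larger than the word vocabulary, so the same amount of text covers fewer of the possible bigrams. The second call shows this: a sentence whose words were all seen in training still gets zero probability once its prosody labels differ.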
Keywords :
linguistics; maximum likelihood estimation; natural languages; probability; sequences; speech recognition; accent recognition; data sparseness; intonational phrase boundary recognition; maximum likelihood estimation; prosody dependent language modeling; prosody dependent speech recognizer; prosody syntax dependence; radio news corpus; word perplexity; word recognition; Aging; Automatic speech recognition; Maximum likelihood detection; Maximum likelihood estimation; Natural languages; Probability distribution; Rhythm; Robustness; Speech recognition; Text recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
Print_ISBN :
0-7803-7980-2
Type :
conf
DOI :
10.1109/ASRU.2003.1318480
Filename :
1318480