Title :
Prosodic word prediction using the lexical information
Author :
Dong, Honghui ; Tao, Jianhua ; Xu, Bo
Author_Institution :
Nat. Lab. of Pattern Recognition, Chinese Acad. of Sci., Beijing, China
Date :
30 Oct.-1 Nov. 2005
Abstract :
As a basic prosodic unit, the prosodic word greatly influences naturalness and intelligibility. Although the relation between the lexicon word and the prosodic word has been widely discussed, prosodic word prediction still cannot reach high precision and recall. This paper shows that lexical features are more efficient for prosodic word prediction. Based on a careful analysis of the mapping relationship and the differences between lexicon words and prosodic words, it proposes two methods to predict prosodic words from a sequence of lexicon words. The first is a statistical probability model that efficiently combines local POS and word length information. Experiments show that, with an appropriate threshold, this statistical model can reach both high precision and high recall. The second is an SVM-based method. In addition to the POS and word length features, the SVM classifier uses a further feature, the in-word-probability (IWP), defined as the probability that a lexicon word is in a prosodic word. Precision and recall improve greatly with the SVM classifier and the IWP feature.
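The IWP definition in the abstract can be illustrated with a minimal sketch. It rests on assumptions not stated in this record: the corpus layout (sentences as lists of prosodic words, each a list of lexicon words), the function name `in_word_probability`, and the reading that a lexicon word counts as "in a prosodic word" when the prosodic word containing it spans more than one lexicon word. The paper's actual estimator may differ, and the toy tokens are made up.

```python
from collections import Counter


def in_word_probability(corpus):
    """Estimate a per-word IWP by relative frequency.

    corpus: list of sentences; each sentence is a list of prosodic words;
    each prosodic word is a list of lexicon-word strings.
    Assumption (not from the paper): a lexicon word is counted as
    "in a prosodic word" when its prosodic word has more than one
    lexicon word, i.e. it was merged with a neighbour.
    """
    total = Counter()   # occurrences of each lexicon word
    merged = Counter()  # occurrences inside a multi-word prosodic word
    for sentence in corpus:
        for prosodic_word in sentence:
            for lexicon_word in prosodic_word:
                total[lexicon_word] += 1
                if len(prosodic_word) > 1:
                    merged[lexicon_word] += 1
    return {word: merged[word] / total[word] for word in total}


# Hypothetical toy annotation with placeholder tokens, not data from the paper.
toy_corpus = [
    [["a", "b"], ["c"]],
    [["a"], ["b", "c"]],
    [["a", "b", "c"]],
]
iwp = in_word_probability(toy_corpus)
```

Under this reading, a value such as `iwp["b"]` could be passed to a classifier as one numeric feature alongside POS and word length.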
Keywords :
natural languages; pattern classification; probability; speech processing; statistical analysis; support vector machines; text analysis; POS; SVM classifier; in-word-probability; lexicon words sequence; prosodic word prediction; statistic probability model; word length information; Automation; Laboratories; Pattern recognition; Predictive models; Probability; Speech synthesis; Statistics; Support vector machine classification; Support vector machines; Technological innovation;
Conference_Title :
Natural Language Processing and Knowledge Engineering, 2005. IEEE NLP-KE '05. Proceedings of 2005 IEEE International Conference on
Print_ISBN :
0-7803-9361-9
DOI :
10.1109/NLPKE.2005.1598732