  • DocumentCode
    2838734
  • Title
    Predicting prosodic words from lexical words - a first step towards predicting prosody from text
  • Author
    Peng, Hua-Jui ; Chen, Chi-Ching ; Tseng, Chiu-Yu ; Chen, Keh-Jiann
  • Author_Institution
    Inst. of Inf. Sci., Acad. Sinica, Taipei, Taiwan
  • fYear
    2004
  • fDate
    15-18 Dec. 2004
  • Firstpage
    173
  • Lastpage
    176
  • Abstract
    Much remains unsolved in how to predict prosody from text for unlimited Mandarin Chinese TTS. The interactions and the rules between syntactic structure and prosodic structure are still unresolved challenges. Using part-of-speech (POS) tagging, which requires lexical information from the text, we aim to find significant word-grouping patterns by analyzing real speech data together with that lexical information. The paper reports discrepancies found between lexical words (LWs) parsed from text and prosodic words (PWs) annotated from speech data, and proposes a statistical model to predict PWs from LWs. In the statistical model, word length and POS tag are the two essential features for predicting PWs. The results show a prediction rate of approximately 90% for PWs, though the model leaves room for extension. We believe that evidence from PW predictions is a first step towards building prosody models from text.
  • Keywords
    natural language interfaces; speech synthesis; statistical analysis; text analysis; lexical words; part-of-speech tagging; prosodic structure; prosodic word prediction; rule-based model; statistical model; syntactic structure; unlimited Mandarin Chinese TTS; unlimited TTS; word grouping; Algorithm design and analysis; Data analysis; Frequency; Government; Information analysis; Information science; Predictive models; Speech analysis; Speech synthesis; Tagging
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Chinese Spoken Language Processing, 2004 International Symposium on
  • Print_ISBN
    0-7803-8678-7
  • Type
    conf
  • DOI
    10.1109/CHINSL.2004.1409614
  • Filename
    1409614