  • DocumentCode
    336788
  • Title
    Using a sigmoid transformation for improved modeling of phoneme duration
  • Author
    Silverman, Kim E. A.; Bellegarda, Jerome R.
  • Author_Institution
    Spoken Language Group, Apple Computer, Inc., Cupertino, CA, USA
  • Volume
    1
  • fYear
    1999
  • fDate
    15-19 Mar 1999
  • Firstpage
    385
  • Abstract
    The “sums-of-products” approach has emerged as one of the most promising avenues to model contextual influences on phoneme duration. The associated regression is generally applied after log-transforming the durations. This paper presents empirical and theoretical evidence which suggests that this transformation is not optimal. A promising alternative solution is proposed, based on a sigmoid function. Preliminary experimental results obtained on over 50,000 phonemes in varied prosodic contexts show that this transformation reduces the unexplained deviations in the data by more than 30%. Alternatively, for a given level of performance, it halves the number of parameters required by the model.
  • Keywords
    speech synthesis; transforms; contextual influences; performance; phoneme duration; prosodic context; regression; sigmoid transformation; sums-of-products approach; Classification tree analysis; Context modeling; Decision trees; Linear regression; Natural languages; Neural networks; Robustness; Speech; Stress; Training data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1999. Proceedings., 1999 IEEE International Conference on
  • Conference_Location
    Phoenix, AZ
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-5041-3
  • Type
    conf
  • DOI
    10.1109/ICASSP.1999.758143
  • Filename
    758143