• DocumentCode
    387962
  • Title

    Speech analysis/Synthesis based on matching the synthesized and the original representations in the auditory nerve level

  • Author

    Ghitza, Oded

  • Author_Institution
    AT&T Bell Laboratories, Murray Hill, NJ, USA
  • Volume
    11
  • fYear
    1986
  • fDate
    Apr 1986
  • Firstpage
    1995
  • Lastpage
    1998
  • Abstract
    Traditional speech analysis/synthesis techniques are designed to produce synthesized speech with a spectrum (or waveform) that is as close as possible to the original. It is suggested, instead, to match the in-synchrony-bands spectrum measures (Ghitza, ICASSP-85, Tampa, FL, Vol. 2, p. 505) of the synthetic and the original speech. This concept has been used in conjunction with a sinusoidal-representation type of speech analysis/synthesis (McAulay and Quatieri, Lincoln Laboratory Technical Report 693, May 1985). Based on informal listening, the resulting speech is natural (with some tonal artifacts) and highly intelligible in both quiet and noisy environments. The same performance is obtained with two overlapping superposed speech waveforms, music waveforms, and speech in a musical background. These results demonstrate the adequacy of the in-synchrony-bands measure in selecting the perceptually meaningful frequency regions of the stimulus spectra. Moreover, the inherent dominance property of this measure reduces the number of sinusoidal components needed for synthesis by approximately 70 percent, offering the potential for a reduced data rate.
  • Keywords
    Acoustic measurements; Frequency estimation; Frequency measurement; Frequency synchronization; Frequency synthesizers; Laboratories; Speech analysis; Speech synthesis; Vocoders; Working environment noise
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '86.
  • Type

    conf

  • DOI
    10.1109/ICASSP.1986.1169191
  • Filename
    1169191