• DocumentCode
    1184266
  • Title
    Applications of positive time-frequency distributions to speech processing
  • Author
    Pitton, James W. ; Atlas, Les E. ; Loughlin, Patrick J.
  • Author_Institution
    AT&T Bell Labs., Murray Hill, NJ, USA
  • Volume
    2
  • Issue
    4
  • fYear
    1994
  • fDate
    10/1/1994 12:00:00 AM
  • Firstpage
    554
  • Lastpage
    566
  • Abstract
    Much of our current knowledge and intuition of speech is derived from analyses involving assumptions of short-time stationarity (e.g., the speech spectrogram). Such methods are, by their very nature, incapable of revealing the true nonstationary nature of speech. A careful consideration of the theory of time-frequency distributions (TFDs), however, allows the construction of methods that reveal far more of the nonstationarities of speech, thereby highlighting just what it is that conventional approaches miss. We apply two iterative methods for generating positive TFDs to speech analysis. Both methods make use of multiple sources of information (e.g., multiple spectrograms) to yield a high-resolution estimate of the joint time-frequency energy density of speech. Plosive events and formant harmonic structure are simultaneously preserved in these TFDs. Rapidly time-varying formants are also resolved by these TFDs, and harmonic structure is revealed, independent of sweep rate; this result is quite different from that seen with conventional speech spectrograms. The speech features observed in these distributions demonstrate that conventional sliding window techniques lose or distort much of the rich nonstationary structure of speech. Examples for synthetic formants and real speech are provided. The differences between joint distributions and conditional distributions are also illustrated.
  • Keywords
    iterative methods; speech analysis and processing; statistical analysis; time-frequency analysis; conditional distributions; formant harmonic structure; high-resolution estimate; joint distributions; plosive events; positive time-frequency distributions; real speech; short-time stationarity; speech analysis; speech processing; speech spectrogram; sweep rate; synthetic formants; time-frequency energy density; time-varying formants; Bandwidth; Density functional theory; Frequency estimation; Information resources; Iterative methods; Spectrogram; Speech analysis; Speech processing; Time frequency analysis; Yield estimation
  • fLanguage
    English
  • Journal_Title
    Speech and Audio Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1063-6676
  • Type
    jour
  • DOI
    10.1109/89.326614
  • Filename
    326614