DocumentCode
1184266
Title
Applications of positive time-frequency distributions to speech processing
Author
Pitton, James W. ; Atlas, Les E. ; Loughlin, Patrick J.
Author_Institution
AT&T Bell Labs., Murray Hill, NJ, USA
Volume
2
Issue
4
fYear
1994
fDate
10/1/1994 12:00:00 AM
Firstpage
554
Lastpage
566
Abstract
Much of our current knowledge and intuition of speech is derived from analyses involving assumptions of short-time stationarity (e.g., the speech spectrogram). Such methods are, by their very nature, incapable of revealing the true nonstationary nature of speech. A careful consideration of the theory of time-frequency distributions (TFDs), however, allows the construction of methods that reveal far more of the nonstationarities of speech, thereby highlighting just what it is that conventional approaches miss. We apply two iterative methods for generating positive time-frequency distributions (TFDs) to speech analysis. Both methods make use of multiple sources of information (e.g., multiple spectrograms) to yield a high-resolution estimate of the joint time-frequency energy density of speech. Plosive events and formant harmonic structure are simultaneously preserved in these TFDs. Rapidly time-varying formants are also resolved by these TFDs, and harmonic structure is revealed, independent of sweep rate; this result is quite different from that seen with conventional speech spectrograms. The speech features observed in these distributions demonstrate that conventional sliding window techniques lose or distort much of the rich nonstationary structure of speech. Examples for synthetic formants and real speech are provided. The differences between joint distributions and conditional distributions are also illustrated.
Keywords
iterative methods; speech analysis and processing; statistical analysis; time-frequency analysis; conditional distributions; formant harmonic structure; high-resolution estimate; iterative methods; joint distributions; plosive events; positive time-frequency distributions; real speech; short-time stationarity; speech analysis; speech processing; speech spectrogram; sweep rate; synthetic formants; time-frequency energy density; time-varying formants; Bandwidth; Density functional theory; Frequency estimation; Information resources; Iterative methods; Spectrogram; Speech analysis; Speech processing; Time frequency analysis; Yield estimation
fLanguage
English
Journal_Title
Speech and Audio Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1063-6676
Type
jour
DOI
10.1109/89.326614
Filename
326614