DocumentCode :
972290
Title :
Time-Scale Modification of Audio Signals Using Enhanced WSOLA With Management of Transients
Author :
Grofit, Shahaf ; Lavner, Yizhar
Author_Institution :
Sch. of Comput. Sci., Tel-Aviv Univ., Tel-Aviv
Volume :
16
Issue :
1
fYear :
2008
Firstpage :
106
Lastpage :
115
Abstract :
In this paper, we present an algorithm for time-scale modification of music signals, based on the waveform similarity overlap-and-add technique (WSOLA). A well-known disadvantage of the standard WSOLA is the uniform time-scaling of the entire signal, including the perceptually significant transient sections (PSTs), where temporal envelope changes as well as significant spectral transitions occur. Time-scaling of PSTs can severely degrade the music quality. We address this problem by detecting the PSTs and leaving them intact, while time-scaling the remainder of the signal, which is relatively steady-state. In the proposed algorithm, the PSTs are detected using a Mel frequency cepstrum nonstationarity measure and the normalized cross-correlation, with time-varying threshold functions. Our study shows that the accurate detection of PSTs within the WSOLA framework makes it possible to achieve a higher quality of time-scaled music, as confirmed by subjective listening tests.
Keywords :
audio signal processing; cepstral analysis; correlation methods; music; Mel frequency cepstrum nonstationarity measure; music signal processing; normalized cross-correlation; perceptually significant transient section detection; spectral transition; temporal envelope; time-scale modification; time-varying threshold function; transient management; waveform similarity overlap-and-add technique; Audio recording; Cepstrum; Computer science; Degradation; Frequency; Instruments; Multiple signal classification; Music; Speech; Time domain analysis; Mel frequency cepstrum; spectral variation; time-scale modification of audio and music signals; waveform similarity overlap-and-add (WSOLA)
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2007.909444
Filename :
4381234