DocumentCode
22887
Title
Deep Scattering Spectrum
Author
Anden, J. ; Mallat, S.
Author_Institution
Centre de Math. Appl., Ecole Polytech., Palaiseau, France
Volume
62
Issue
16
fYear
2014
fDate
Aug. 15, 2014
Firstpage
4114
Lastpage
4128
Abstract
A scattering transform defines a locally translation invariant representation which is stable to time-warping deformation. It extends MFCC representations by computing modulation spectrum coefficients of multiple orders, through cascades of wavelet convolutions and modulus operators. Second-order scattering coefficients characterize transient phenomena such as attacks and amplitude modulation. A frequency transposition invariant representation is obtained by applying a scattering transform along log-frequency. State-of-the-art classification results are obtained for musical genre and phone classification on GTZAN and TIMIT databases, respectively.
Keywords
acoustic wave scattering; amplitude modulation; audio signal processing; cepstral analysis; signal classification; signal representation; GTZAN database; MFCC; TIMIT database; audio classification; deep scattering spectrum; frequency transposition invariant representation; mel-frequency cepstral coefficients; modulus operators; musical genre; phone classification; scattering transform; second-order scattering coefficients; spectrum coefficients; time-warping deformation; transient phenomena; wavelet convolutions; Convolution; Frequency modulation; Scattering; Spectrogram; Wavelet analysis; Wavelet transforms; Audio classification; MFCC; deep neural networks; modulation spectrum; wavelets;
fLanguage
English
Journal_Title
Signal Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1053-587X
Type
jour
DOI
10.1109/TSP.2014.2326991
Filename
6822556