• DocumentCode
    22887
  • Title
    Deep Scattering Spectrum
  • Author
    Anden, J.; Mallat, S.
  • Author_Institution
    Centre de Math. Appl., Ecole Polytech., Palaiseau, France
  • Volume
    62
  • Issue
    16
  • fYear
    2014
  • fDate
    Aug. 15, 2014
  • Firstpage
    4114
  • Lastpage
    4128
  • Abstract
    A scattering transform defines a locally translation invariant representation which is stable to time-warping deformation. It extends MFCC representations by computing modulation spectrum coefficients of multiple orders, through cascades of wavelet convolutions and modulus operators. Second-order scattering coefficients characterize transient phenomena such as attacks and amplitude modulation. A frequency transposition invariant representation is obtained by applying a scattering transform along log-frequency. State-of-the-art classification results are obtained for musical genre and phone classification on GTZAN and TIMIT databases, respectively.
  • Keywords
    acoustic wave scattering; amplitude modulation; audio signal processing; cepstral analysis; signal classification; signal representation; GTZAN database; MFCC; TIMIT database; audio classification; deep scattering spectrum; frequency transposition invariant representation; mel-frequency cepstral coefficients; modulus operators; musical genre; phone classification; scattering transform; second-order scattering coefficients; spectrum coefficients; time-warping deformation; transient phenomena; wavelet convolutions; Convolution; Frequency modulation; Scattering; Spectrogram; Wavelet analysis; Wavelet transforms; Audio classification; MFCC; deep neural networks; modulation spectrum; wavelets
  • fLanguage
    English
  • Journal_Title
    Signal Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1053-587X
  • Type
    jour
  • DOI
    10.1109/TSP.2014.2326991
  • Filename
    6822556