• DocumentCode
    1051350
  • Title

    Time–Frequency Sparsity by Removing Perceptually Irrelevant Components Using a Simple Model of Simultaneous Masking

  • Author

    Balazs, Peter ; Laback, Bernhard ; Eckel, Gerhard ; Deutsch, Werner A.

  • Author_Institution
    Acoust. Res. Inst., Austrian Acad. of Sci., Vienna, Austria
  • Volume
    18
  • Issue
    1
  • fYear
    2010
  • Firstpage
    34
  • Lastpage
    49
  • Abstract
    We present an algorithm for removing time-frequency components, found by a standard Gabor transform, of a "real-world" sound while causing no audible difference to the original sound after resynthesis. Thus, this representation is made sparser. The selection of removable components is based on a simple model of simultaneous masking in the auditory system. Important goals were applicability to any real-world music and speech sound, integration of mutual masking effects between time-frequency components, coping with the time-frequency spread of such an operation, and computational efficiency. The proposed algorithm first estimates the masked threshold within an analysis window. The masked threshold function is then shifted in level by an amount determined experimentally, and all components falling below this function (the irrelevance threshold) are removed. This shift gives a conservative way to deal with uncertainty effects resulting from removing time-frequency components and with inaccuracies in the masking model. The removal of components is described as an adaptive Gabor multiplier. Thirty-six normal-hearing subjects participated in an experiment to determine the maximum shift value for which they could not discriminate the irrelevance-filtered signal from the original signal. On average across the test stimuli, 32 percent of the time-frequency components fell below the irrelevance threshold.
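    The thresholding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' masking model: the triangular spreading rule and the parameters `offset_db` and `slope_db_per_bin` are hypothetical placeholders, and the function name `irrelevance_mask` is invented for illustration. Only the overall shape (estimate a masked threshold per analysis frame, shift it in level, keep the components at or above it) follows the abstract. The Gabor analysis and resynthesis steps are omitted.

```python
import numpy as np

def irrelevance_mask(coeffs, shift_db=0.0, offset_db=10.0, slope_db_per_bin=3.0):
    """Return a boolean mask over time-frequency coefficients (freq x time).

    Assumption: each component masks its neighbours in the same frame with a
    level that starts offset_db below its own level and decays linearly by
    slope_db_per_bin per frequency bin. This is a placeholder spreading rule.
    """
    level_db = 20.0 * np.log10(np.abs(coeffs) + 1e-12)
    n_bins = coeffs.shape[0]
    bins = np.arange(n_bins)
    dist = np.abs(bins[:, None] - bins[None, :])  # |target bin - masker bin|

    # spread[k, j, t]: masking produced at bin k by the masker at bin j, frame t
    spread = level_db[None, :, :] - offset_db - slope_db_per_bin * dist[:, :, None]

    # Combine the maskers by taking the strongest one, then shift the result
    # in level to obtain the irrelevance threshold.
    threshold_db = spread.max(axis=1) + shift_db

    # True where a component reaches the threshold and is therefore kept.
    return level_db >= threshold_db

# Toy frame: a strong component at bin 5 and a weak neighbour at bin 6.
coeffs = np.zeros((8, 1), dtype=complex)
coeffs[5, 0] = 1.0
coeffs[6, 0] = 0.01

mask = irrelevance_mask(coeffs)
sparse_coeffs = coeffs * mask  # pointwise multiplication, as in a Gabor multiplier
```

    A more negative `shift_db` lowers the threshold and removes fewer components, which corresponds to the conservative direction of the level shift mentioned in the abstract.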
  • Keywords
    Gabor filters; acoustic signal processing; adaptive signal processing; hearing; music; speech; transforms; Gabor transform; adaptive Gabor multiplier; auditory system; music sound; mutual masking effect; perceptually irrelevant component; simultaneous masking; speech sound; time-frequency component; time-frequency sparsity; Efficient algorithm; Gabor filter; Gabor transform; irrelevance filter; masking model; simultaneous masking; sparse representation; spectral masking; time-variant filter;
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type

    jour

  • DOI
    10.1109/TASL.2009.2023164
  • Filename
    5061594