Speech enhancement using Discriminative Random Fields

Author

Ajey Saligrama; Ranjani H. G.; H. N. Shankar; R. Muralishankar

Author_Institution

PESIT-Bangalore South Campus, India

fYear

2015

Firstpage

1

Lastpage

6

Abstract

Speech enhancement in stationary noise is addressed using the ideal channel selection framework. To estimate the binary mask, we propose classifying each time-frequency (T-F) bin of the noisy signal as speech or noise using Discriminative Random Fields (DRF). The DRF function contains two terms: an enhancement function and a smoothing term. On each T-F bin, we propose an enhancement function based on a likelihood ratio test for speech presence, while an Ising model serves as the smoothing function, enforcing spectro-temporal continuity in the estimated binary mask. Over successive iterations, the smoothing function is found to reduce musical noise compared with using the enhancement function alone. The binary mask is inferred from the noisy signal using the Iterated Conditional Modes (ICM) algorithm. Sentences from the NOIZEUS corpus are evaluated at Signal-to-Noise Ratios (SNR) from 0 dB to 15 dB in four additive noise settings: additive white Gaussian noise, car noise, street noise and pink noise. The reconstructed speech is evaluated in terms of average segmental SNR, Perceptual Evaluation of Speech Quality (PESQ) and Mean Opinion Score (MOS).
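The inference step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact form of the enhancement function or the smoothing weights, so the sketch assumes that a per-bin log-likelihood ratio array `llr` is already available, and that the Ising term uses a 4-connected neighbourhood with a single coupling strength `beta`. The names `icm_binary_mask`, `llr`, `beta` and `n_iter` are hypothetical.

```python
import numpy as np


def icm_binary_mask(llr, beta=1.0, n_iter=5):
    """Estimate a binary T-F mask with Iterated Conditional Modes (ICM).

    llr    : 2-D array (frequency x time) of per-bin log-likelihood ratios;
             positive values favour speech (assumed input, not from the paper).
    beta   : Ising coupling strength (assumed parameter).
    n_iter : maximum number of ICM sweeps (assumed parameter).
    Returns a boolean array, True where a bin is labelled as speech.
    """
    llr = np.asarray(llr, dtype=float)
    # Start from the per-bin decision alone, with labels in {-1, +1}.
    labels = np.where(llr > 0, 1, -1)
    n_freq, n_time = llr.shape

    for _ in range(n_iter):
        changed = False
        for f in range(n_freq):
            for t in range(n_time):
                # Sum of the labels of the 4-connected neighbours.
                s = 0
                if f > 0:
                    s += labels[f - 1, t]
                if f < n_freq - 1:
                    s += labels[f + 1, t]
                if t > 0:
                    s += labels[f, t - 1]
                if t < n_time - 1:
                    s += labels[f, t + 1]
                # Local score: unary (enhancement) term plus Ising smoothing.
                score = 0.5 * llr[f, t] + beta * s
                if score > 0:
                    new = 1
                elif score < 0:
                    new = -1
                else:
                    new = labels[f, t]  # tie: keep the current label
                if new != labels[f, t]:
                    labels[f, t] = new
                    changed = True
        if not changed:
            break  # converged to a local optimum
    return labels == 1


# Toy input: a noise-dominated region with one isolated "speech" bin,
# the kind of spurious detection that is heard as musical noise.
toy_llr = np.full((5, 5), -2.0)
toy_llr[2, 2] = 1.0
mask_unsmoothed = icm_binary_mask(toy_llr, beta=0.0)
mask_smoothed = icm_binary_mask(toy_llr, beta=1.0)
```

With `beta=0.0` the result reduces to the per-bin decision, so the isolated bin survives; with a positive `beta` the neighbouring noise labels outweigh it and the bin is removed. This mirrors, in a simplified setting, the abstract's claim that the smoothing term reduces musical noise relative to the enhancement function alone.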

Keywords

"Speech","Signal to noise ratio","Transforms","AWGN","Additives","Indexes","Noise measurement"

Publisher

ieee

Conference_Titel

TENCON 2015 - 2015 IEEE Region 10 Conference

ISSN

2159-3442

Print_ISBN

978-1-4799-8639-2

Electronic_ISBN

2159-3450

Type

conf

DOI

10.1109/TENCON.2015.7373173

Filename

7373173