A VQ-Based Single-Channel Audio Separation for Music/Speech Mixtures

Author

Asgari, Meysam ; Fallah, Mahdi ; Mehrizi, Elahe Abouie ; Mostafavi, Ali

Author_Institution

Dept. of Electr. Eng., Amirkabir Univ. of Technol., Tehran

fYear

2009

fDate

25-27 March 2009

Firstpage

223

Lastpage

227

Abstract

In this paper, we address the problem of audio source separation with a single sensor, based on estimating a statistical model of the sources. We improve on state-of-the-art vector quantization (VQ) by considering a priori histograms of a large body of training data. This results in a more accurate codebook for each source than the commonly used Linde-Buzo-Gray (LBG) algorithm produces. An optimum estimator based on discrete Fourier transform (DFT) amplitudes is introduced in the separation stage. Finally, simulations demonstrate that the proposed approach efficiently separates audio mixtures in terms of signal-to-distortion ratio (SDR) as well as mean opinion score (MOS).
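
The pipeline outlined in the abstract (one codebook per source, separation on DFT amplitudes, SDR evaluation) can be illustrated with a minimal sketch. The abstract does not specify the histogram-based codebook training or the exact form of the optimum estimator, so this sketch swaps in plain k-means (Lloyd iterations, in the spirit of LBG) for training and a soft-mask split for the estimator. All function names and parameters (`train_codebook`, `separate_frame`, `sdr_db`, `k`, `iters`, `eps`) are hypothetical, and the SDR shown is the basic energy-ratio definition, which may differ from the measure used in the paper.

```python
import numpy as np

def train_codebook(frames, k, iters=20, seed=0):
    """Assumed stand-in: plain k-means over magnitude frames, not the
    paper's histogram-based training. frames has shape (N, F)."""
    rng = np.random.default_rng(seed)
    codebook = frames[rng.choice(len(frames), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distance from every frame to every codeword: shape (N, k).
        dist = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = frames[labels == j]
            if len(members):  # leave empty cells unchanged
                codebook[j] = members.mean(axis=0)
    return codebook

def separate_frame(mix_mag, cb_a, cb_b, eps=1e-12):
    """Assumed estimator: pick the codeword pair whose sum is closest to the
    mixture DFT magnitude, then split the mixture with a soft mask."""
    sums = cb_a[:, None, :] + cb_b[None, :, :]        # shape (Ka, Kb, F)
    err = ((sums - mix_mag) ** 2).sum(axis=2)          # shape (Ka, Kb)
    i, j = np.unravel_index(err.argmin(), err.shape)
    mask_a = cb_a[i] / (cb_a[i] + cb_b[j] + eps)
    return mask_a * mix_mag, (1.0 - mask_a) * mix_mag

def sdr_db(reference, estimate, eps=1e-12):
    """Basic signal-to-distortion ratio in dB (energy-ratio form)."""
    distortion = reference - estimate
    return 10.0 * np.log10((reference ** 2).sum() / ((distortion ** 2).sum() + eps))
```

In use, each codebook would be trained on DFT magnitude frames of one source (speech or music), `separate_frame` would be applied to every mixture frame, and the mixture phase would typically be reused for resynthesis. The exhaustive pair search costs on the order of Ka × Kb × F operations per frame, which is acceptable only for small codebooks.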

Keywords

audio coding; discrete Fourier transforms; music; source separation; speech coding; statistical analysis; vector quantisation; DFT; Linde-Buzo-Gray algorithm; VQ-based single-channel audio source separation; a priori histogram; discrete Fourier transform; music-speech mixture; statistical model estimation; vector quantization; Discrete Fourier transforms; Distortion measurement; Electronic mail; Hidden Markov models; Independent component analysis; Instruments; Psychoacoustic models; Spectrogram; Speech; Vector quantization

fLanguage

English

Publisher

IEEE

Conference_Titel

Computer Modelling and Simulation, 2009. UKSIM '09. 11th International Conference on

Conference_Location

Cambridge

Print_ISBN

978-1-4244-3771-9

Electronic_ISBN

978-0-7695-3593-7

Type

conf

DOI

10.1109/UKSIM.2009.123

Filename

4809767