Title :
Single channel dereverberation method in log-melspectral domain using limited stereo data for distant speaker identification
Author :
Nugraha, Aditya Arie ; Yamamoto, Koji ; Nakagawa, Sachiko
Author_Institution :
Dept. of Comput. Sci. & Eng., Toyohashi Univ. of Technol., Toyohashi, Japan
Date :
Oct. 29 2013-Nov. 1 2013
Abstract :
In this paper, we present a feature enhancement method that uses neural networks (NNs) to map a reverberant feature in the log-melspectral domain to its corresponding anechoic feature. The mapping is performed by cascade NNs trained with the Cascade 2 algorithm, together with an implementation of segment-based normalization. We assumed that the feature dimensions were independent of each other and experimented with several assumptions about the room transfer function for each dimension. A speaker identification system was used to evaluate the method. Using limited stereo data, we improved the identification rate on both simulated and real datasets. On the simulated dataset, the proposed method was effective in both noiseless and noisy reverberant environments with various noise and reverberation characteristics. On the real dataset, a configuration of 6 independent NNs for a 24-dimensional feature, using only 1 pair of utterances, achieved a 35% average error reduction relative to the baseline, which employed cepstral mean normalization (CMN).
Keywords :
anechoic chambers (acoustic); cepstral analysis; feature extraction; microphones; neural nets; reverberation; speaker recognition; transfer functions; Cascade2 algorithm; anechoic feature; automatic speaker recognition; automatic speech recognition; average error reduction; cepstral mean normalization; distant speaker identification; distant-talking microphone; feature enhancement method; identification rate improvement; limited stereo data; log-melspectral domain; neural networks; noiseless reverberant environments; noisy reverberant environments; reverberant feature map; room transfer function; segment-based normalization; single channel dereverberation method; Artificial neural networks; Neurons; Noise measurement; Reverberation; Speech; Speech recognition; Training data;
Conference_Title :
Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2013 Asia-Pacific
Conference_Location :
Kaohsiung
DOI :
10.1109/APSIPA.2013.6694117