DocumentCode :
106284
Title :
Speaker identification using multimodal neural networks and wavelet analysis
Author :
Almaadeed, Noor ; Aggoun, Amar ; Amira, Abbes
Author_Institution :
Dept. of Comput. Eng., Brunel Univ., Uxbridge, UK
Volume :
4
Issue :
1
fYear :
2015
fDate :
March 2015
Firstpage :
18
Lastpage :
28
Abstract :
Rapid technological progress in recent years has led to a tremendous rise in the use of biometric authentication systems. The objective of this research is to investigate the problem of identifying a speaker from their voice, regardless of the spoken content. In this study, the authors designed and implemented a novel text-independent multimodal speaker identification system based on wavelet analysis and neural networks. The wavelet analysis stage comprises the discrete wavelet transform, the wavelet packet transform, wavelet sub-band coding and Mel-frequency cepstral coefficients (MFCCs). The learning module comprises general regressive, probabilistic and radial basis function neural networks, whose outputs are combined into a decision through a majority voting scheme. The system was found to be competitive: it improved the identification rate by 15% compared with classical MFCC, and it reduced the identification time by 40% compared with the back-propagation neural network, the Gaussian mixture model and principal component analysis. Performance tests conducted using the GRID database corpora have shown that this approach achieves a faster identification time and greater accuracy than traditional approaches, and that it is applicable to real-time, text-independent speaker identification systems.
Keywords :
Gaussian processes; audio databases; backpropagation; biometrics (access control); cepstral analysis; discrete wavelet transforms; mixture models; principal component analysis; radial basis function networks; speaker recognition; text analysis; GRID database corpora; Gaussian mixture model; MFCC; Mel-frequency cepstral coefficients; back-propagation neural network; biometric authentication systems; discrete wavelet transform; general regressive neural networks; learning module; majority voting scheme; multimodal neural networks; principal component analysis; probabilistic neural networks; radial basis function neural networks; text-independent multimodal speaker identification system; wavelet analysis; wavelet packet transform; wavelet subband coding;
fLanguage :
English
Journal_Title :
Biometrics, IET
Publisher :
iet
ISSN :
2047-4938
Type :
jour
DOI :
10.1049/iet-bmt.2014.0011
Filename :
7062142
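Note :
The majority voting scheme mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `majority_vote`, the speaker labels and the tie-breaking rule (the earliest-listed classifier wins) are all assumptions made for illustration, since the abstract does not say how ties are resolved.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most classifiers.

    `predictions` holds one label per classifier, e.g. one each from the
    general regressive, probabilistic and radial basis function networks.
    Tie-breaking is an assumption: the first label in input order that
    reaches the highest count is returned.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label


# Hypothetical speaker labels from three classifiers.
print(majority_vote(["spk07", "spk07", "spk12"]))
```

With three classifiers, any label that two of them agree on wins. If all three disagree, this sketch falls back to the first classifier's label.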