DocumentCode
166087
Title
A comparison of Multi-Layer Perceptron and Radial Basis Function neural network in the voice conversion framework
Author
Chadha, Ankita N. ; Nirmal, Jagannath H. ; Zaveri, Mukesh A.
Author_Institution
Dept. of Electron. Eng., K.J. Somaiya Coll. of Eng., Mumbai, India
fYear
2014
fDate
24-27 Sept. 2014
Firstpage
1045
Lastpage
1052
Abstract
A voice conversion system modifies the speaker-specific features of a source speaker's speech so that it sounds like the speech of a target speaker. Voice individuality in the speech signal is characterized at various levels, such as the shape of the glottal excitation, the shape of the vocal tract, and long-term prosodic features. In this work, Line Spectral Frequencies (LSF) represent the shape of the vocal tract, and the Linear Predictive (LP) residual represents the shape of the glottal excitation of a particular speaker. Multi-Layer Perceptron (MLP) and Radial Basis Function (RBF) neural networks are explored to formulate the nonlinear mapping for modifying the LSFs. The baseline residual selection method is used to modify the LP residual of one speaker to that of another. A relative comparison between MLP and RBF is carried out using various objective and subjective measures for inter-gender and intra-gender voice conversion. The results reveal that an optimized RBF performs slightly better than the baseline MLP-based voice conversion.
Keywords
multilayer perceptrons; radial basis function networks; speaker recognition; speech processing; MLP; RBF; baseline residual selection method; glottal excitation; intergender voice conversion; intragender voice conversion; line spectral frequencies; linear predictive residual; multilayer perceptron; nonlinear mapping; prosodic features; radial basis function neural network; speaker specific feature; speech signal; vocal tract; voice conversion framework; Artificial neural networks; Feature extraction; Shape; Speech; Training; Vectors; dynamic time warping; line spectral frequencies; multi-layer perceptron; radial basis function; residual selection; voice conversion;
fLanguage
English
Publisher
IEEE
Conference_Titel
Advances in Computing, Communications and Informatics (ICACCI), 2014 International Conference on
Conference_Location
New Delhi
Print_ISBN
978-1-4799-3078-4
Type
conf
DOI
10.1109/ICACCI.2014.6968405
Filename
6968405