• DocumentCode
    166087
  • Title

    A comparison of Multi-Layer Perceptron and Radial Basis Function neural network in the voice conversion framework

  • Author

    Chadha, Ankita N. ; Nirmal, Jagannath H. ; Zaveri, Mukesh A.

  • Author_Institution
    Dept. of Electron. Eng., K.J. Somaiya Coll. of Eng., Mumbai, India
  • fYear
    2014
  • fDate
    24-27 Sept. 2014
  • Firstpage
    1045
  • Lastpage
    1052
  • Abstract
    A voice conversion system modifies the speaker-specific features of a source speaker's speech so that it sounds like the speech of a target speaker. The voice individuality of a speech signal is characterized at various levels, such as the shape of the glottal excitation, the shape of the vocal tract, and the long-term prosodic features. In this work, Line Spectral Frequencies (LSFs) represent the shape of the vocal tract, and the Linear Predictive (LP) residual represents the shape of the glottal excitation of a particular speaker. Multi-Layer Perceptron (MLP) and Radial Basis Function (RBF) neural networks are explored to formulate the nonlinear mapping for modifying the LSFs. The baseline residual selection method is used to modify the LP residual of one speaker to that of another speaker. A relative comparison between MLP and RBF is carried out using various objective and subjective measures for inter-gender and intra-gender voice conversion. The results reveal that an optimized RBF performs slightly better than the baseline MLP-based voice conversion.
  • Keywords
    multilayer perceptrons; radial basis function networks; speaker recognition; speech processing; MLP; RBF; baseline residual selection method; glottal excitation; intergender voice conversion; intragender voice conversion; line spectral frequencies; linear predictive residual; multilayer perceptron; nonlinear mapping; prosodic features; radial basis function neural network; speaker specific feature; speech signal; vocal tract; voice conversion framework; Artificial neural networks; Feature extraction; Shape; Speech; Training; Vectors; dynamic time warping; line spectral frequencies; multi-layer perceptron; radial basis function; residual selection; voice conversion;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advances in Computing, Communications and Informatics (ICACCI), 2014 International Conference on
  • Conference_Location
    New Delhi
  • Print_ISBN
    978-1-4799-3078-4
  • Type

    conf

  • DOI
    10.1109/ICACCI.2014.6968405
  • Filename
    6968405