• DocumentCode
    177995
  • Title
    Multimodal voice conversion using non-negative matrix factorization in noisy environments
  • Author
    Masaka, Kenta ; Aihara, Ryo ; Takiguchi, Tetsuya ; Ariki, Yasuo
  • Author_Institution
    Grad. Sch. of Syst. Inf., Kobe Univ., Kobe, Japan
  • fYear
    2014
  • fDate
    4-9 May 2014
  • Firstpage
    1542
  • Lastpage
    1546
  • Abstract
    This paper presents a multimodal voice conversion (VC) method for noisy environments. In our previous non-negative matrix factorization (NMF)-based VC method, source and target exemplars are extracted from parallel training data, in which the source and target speakers utter the same texts. The input source signal is then decomposed into source exemplars, noise exemplars obtained from the input signal, and their weights. The converted speech is then constructed from the target exemplars and the weights of the source exemplars. In this paper, we propose a multimodal VC method that improves the noise robustness of our NMF-based VC method. Using joint audio-visual features as source features improves VC performance over the previous audio-input NMF-based VC method. The effectiveness of the proposed method was confirmed by comparing it with a conventional Gaussian mixture model (GMM)-based method.
  • Keywords
    matrix decomposition; speech recognition; NMF-based VC method; joint audio-visual features; multimodal voice conversion method; nonnegative matrix factorization; parallel training data; source exemplars; target exemplars; Dictionaries; Feature extraction; Noise; Noise measurement; Speech; Speech recognition; Visualization; image features; multimodal; noise robustness; non-negative matrix factorization; voice conversion;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
  • Conference_Location
    Florence
  • Type
    conf
  • DOI
    10.1109/ICASSP.2014.6853856
  • Filename
    6853856
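
The exemplar-based conversion outlined in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`estimate_activations`, `convert`), the KL-divergence multiplicative update, the sparsity weight, and the iteration count are all assumptions chosen for illustration. The sketch assumes `A_src` and `A_tgt` are parallel dictionaries whose columns are time-aligned source and target exemplars, and that `N` holds noise exemplars taken from the input signal. For the multimodal case, the rows of `X_src`, `A_src`, and `N` could stack audio and visual features, which is also an assumption here.

```python
import numpy as np

def estimate_activations(X, D, n_iter=200, sparsity=0.1, eps=1e-12):
    """Estimate non-negative weights H such that X is approximately D @ H.

    The dictionary D is kept fixed; only H is updated, using the standard
    multiplicative rule for the KL divergence with an L1 penalty on H.
    """
    rng = np.random.default_rng(0)
    H = rng.random((D.shape[1], X.shape[1])) + eps
    col_sums = D.sum(axis=0)[:, None]  # equals D^T @ ones, one value per exemplar
    for _ in range(n_iter):
        ratio = X / (D @ H + eps)
        H *= (D.T @ ratio) / (col_sums + sparsity + eps)
    return H

def convert(X_src, A_src, A_tgt, N):
    """Convert noisy source features using parallel exemplar dictionaries.

    X_src: (F_src, T) non-negative features of the noisy input
    A_src: (F_src, J) source exemplars
    A_tgt: (F_tgt, J) target exemplars, column-aligned with A_src
    N:     (F_src, K) noise exemplars (assumed taken from the input signal)
    """
    D = np.hstack([A_src, N])            # joint source + noise dictionary
    H = estimate_activations(X_src, D)   # weights for every exemplar
    H_src = H[: A_src.shape[1]]          # keep only the source-exemplar weights
    return A_tgt @ H_src                 # rebuild with target exemplars, noise discarded
```

Discarding the rows of `H` that belong to the noise exemplars is what makes the reconstruction noise-robust in this sketch: only the weights tied to speech exemplars are carried over to the target dictionary.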