HNM parameter transform for voice conversion using a HMM-WDLT framework

Author

Hu, H.T. ; Yu, C. ; Lin, C.H.

Author_Institution

Dept. of Electron. Eng., Nat. I-Lan Univ., I-Lan, Taiwan

Volume

2

fYear

2010

fDate

30-31 May 2010

Firstpage

282

Lastpage

287

Abstract

This paper presents a Harmonic + Noise Model (HNM)-based voice conversion technique under a Hidden Markov Model-Weighted Deviation Linear Transformation (HMM-WDLT) framework. In a comparative study of four methods of converting the extracted line spectral frequency (LSF) parameters of one speaker to those of another, the HMM-WDLT achieves the lowest average spectral distortion. A remedial process is developed to enhance the formant characteristics while preserving the variance of the LSF parameters. The frame duration, manifested by the slope of the dynamic time warping (DTW) path, is regarded as an output variable of the conversion function. To take full advantage of the attributes of the HNM, the conversion algorithm fine-tunes the harmonic magnitudes below 2 kHz for each critical band. Listening tests reveal that the converted speech successfully captures the speaker's individuality with satisfactory quality.

Keywords

Acoustic noise; Electronics industry; Filters; Frequency; Hidden Markov models; Loudspeakers; Mechatronics; Natural languages; Speech enhancement; Speech synthesis; Harmonic + noise model; Voice conversion; Hidden Markov model - weighted deviation linear transform

fLanguage

English

Publisher

IEEE

Conference_Titel

Industrial Mechatronics and Automation (ICIMA), 2010 2nd International Conference on

Conference_Location

Wuhan, China

Print_ISBN

978-1-4244-7653-4

Type

conf

DOI

10.1109/ICINDMA.2010.5538313

Filename

5538313