Title :
Accurate parameter generation using fixed-point arithmetic for embedded HMM-based speech synthesizers
Author :
Nishizawa, Nobuyuki ; Kato, Tsuneo
Author_Institution :
KDDI R&D Labs. Inc., Saitama, Japan
Abstract :
Parameter trajectory generation for HMM-based speech synthesis is achieved in practice using only fixed-point arithmetic with 32-bit integers. Because processors for embedded devices often lack a hardware floating-point unit, such devices need a speech synthesizer that uses only fixed-point arithmetic. This study introduces a new method for reducing rounding errors, together with optimized value scaling, and discusses the generation of F0 trajectories. Experimental results indicate that the proposed method reduces the RMSE of F0 on a logarithmic scale to approximately 0.04 semitones (1 semitone = 1/12 octave), even when a 2-bit margin is reserved to avoid calculation overflow. An extension to trajectories that consider the global variance (GV), built on the basic program for trajectories without GV, is also introduced. The extension reduces the required iteration count to 5 for an RMSE of 0.05 semitones, comparable to the converged results of the conventional method.
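The semitone error measure quoted in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names (`semitone_rmse`, `to_fixed`, `from_fixed`), the choice of quantizing log2(F0), and the 8 fractional bits are all assumptions made for illustration. It only shows how an RMSE on a logarithmic F0 scale is expressed in semitones and how rounding to a fixed-point grid bounds that error.

```python
import math


def semitone_rmse(f0_ref_hz, f0_test_hz):
    """RMSE between two F0 trajectories in semitones (12 semitones per octave)."""
    if len(f0_ref_hz) != len(f0_test_hz) or not f0_ref_hz:
        raise ValueError("trajectories must be non-empty and of equal length")
    squared = [
        (12.0 * math.log2(test / ref)) ** 2
        for ref, test in zip(f0_ref_hz, f0_test_hz)
    ]
    return math.sqrt(sum(squared) / len(squared))


def to_fixed(x, frac_bits):
    """Round a real value to an integer with frac_bits fractional bits (hypothetical helper)."""
    return int(round(x * (1 << frac_bits)))


def from_fixed(q, frac_bits):
    """Convert a fixed-point integer back to a float (hypothetical helper)."""
    return q / (1 << frac_bits)


# Illustrative values only: quantize log2(F0) with 8 fractional bits.
FRAC_BITS = 8  # assumption, not a setting taken from the paper
reference = [110.0, 150.0, 220.0, 300.0]
quantized = [
    2.0 ** from_fixed(to_fixed(math.log2(f0), FRAC_BITS), FRAC_BITS)
    for f0 in reference
]
error = semitone_rmse(reference, quantized)
```

Rounding to the nearest grid point leaves at most half a quantization step of error per frame, so with 8 fractional bits in octaves the RMSE cannot exceed 12 / 2^9 semitones. Keeping a margin against overflow, as the abstract describes, leaves fewer bits for the fraction and so loosens this bound.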
Keywords :
fixed point arithmetic; floating point arithmetic; hidden Markov models; speech synthesis; GV; RMSE; embedded HMM-based speech synthesizers; global variance; hardware-based floating-point number processor; parameter trajectory generation; word length 32 bit; Computational efficiency; Equations; Mathematical model; Speech; Trajectory; HMM-based speech synthesis; embedded devices
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location :
Prague
Print_ISBN :
978-1-4577-0538-0
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2011.5947403