Title :
Improving the intelligibility of dysarthric speech by modifying system parameters, retaining speaker's identity
Author :
Saranya, M. ; Vijayalakshmi, P. ; Thangavelu, Nagarajan
Author_Institution :
Dept. of Electron. & Commun. Eng., SSN Coll. of Eng., Chennai, India
Abstract :
Dysarthria is a neuromotor impairment of speech that affects one or more subsystems involved in speech production. Such impairment is reflected in the acoustic characteristics of phonemes uttered by a dysarthric speaker. If such a speaker suffers from laryngeal dysfunction and improper articulation, then he/she may not be able to utter some or most of the phonemes properly. In our work, the poorly uttered phonemes in the utterance of a dysarthric speaker are located and replaced with the corresponding phonemes from a normal speaker's speech signal. However, the resultant speech signal after concatenation does not sound natural because of discontinuities in short-term energy, pitch period, and formant contour at the concatenation points. The discontinuity in the short-term energy function is handled by smoothing the short-term energy of a few frames before and after the concatenation point. Since the pitch period in the replaced segment (phoneme) is considerably different from the dysarthric speaker's pitch period, the pitch period of the replaced segment is adjusted to resemble that of the dysarthric speaker. The quality and naturalness of the utterance improve considerably after pitch modification. The discontinuity in the formant contour arises because the co-articulation effect is absent, as the replaced unit is taken from a different context. Using linear prediction analysis, the pole locations and their corresponding radii are adjusted based on the pole locations of the adjacent phonemes. The quality and naturalness of the speech signal, after all three modifications, are found to be very close to those of natural speech.
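The short-term energy smoothing step described in the abstract could be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not give the frame length, the number of frames, or the smoothing rule, so the function names (frame_energy, smooth_energy_at_join), the non-overlapping frames, the default parameter values, and the linear energy ramp are all assumptions.

```python
import numpy as np

def frame_energy(x, start, frame_len):
    """Short-term energy of one frame: sum of squared samples."""
    seg = x[start:start + frame_len]
    return float(np.sum(seg ** 2))

def smooth_energy_at_join(x, join, frame_len=160, n_frames=3):
    """Rescale n_frames frames on each side of sample index `join`.

    Assumption: frame energies inside the window are made to ramp
    linearly from the energy of the frame just before the window to
    the energy of the frame just after it.
    """
    x = np.asarray(x, dtype=float)
    y = x.copy()
    first = join - n_frames * frame_len   # start of smoothing window
    last = join + n_frames * frame_len    # end of window (exclusive)
    if first - frame_len < 0 or last + frame_len > len(x):
        raise ValueError("not enough samples around the join")

    # Reference energies taken from the untouched neighbouring frames.
    e_left = frame_energy(x, first - frame_len, frame_len)
    e_right = frame_energy(x, last, frame_len)

    # Target energies for the frames inside the window (endpoints excluded).
    targets = np.linspace(e_left, e_right, 2 * n_frames + 2)[1:-1]

    for k, target in enumerate(targets):
        start = first + k * frame_len
        actual = frame_energy(x, start, frame_len)
        if actual > 0.0:
            # Gain that maps the frame's energy onto its target.
            y[start:start + frame_len] *= np.sqrt(target / actual)
    return y
```

A constant gain per frame leaves small amplitude steps at frame boundaries; a practical version would likely interpolate the gain from sample to sample or use overlapping windowed frames.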
Keywords :
speaker recognition; speech processing; articulation; co-articulation effect; dysarthria neuromotor impairment; dysarthric speaker identity; dysarthric speaker utterance; dysarthric speech intelligibility; formant contour; laryngeal dysfunction; linear prediction analysis; normal speaker speech signal; phonemes; pitch period; short-term energy; speech signal naturalness; speech signal quality; system parameter; Acoustics; Context; Estimation; Natural languages; Resonant frequency; Shape; Speech; Dysarthria; formant contour; improper articulation; intelligibility modification; laryngeal dysfunction; pitch contour; short-term energy; speaker identity;
Conference_Title :
Recent Trends In Information Technology (ICRTIT), 2012 International Conference on
Conference_Location :
Chennai, Tamil Nadu
Print_ISBN :
978-1-4673-1599-9
DOI :
10.1109/ICRTIT.2012.6206799