Title :
Multidimensional acoustic analysis for voice quality assessment based on the GRBAS scale
Author :
Ping Yu ; Zhijian Wang ; Shanshan Liu ; Nan Yan ; Lan Wang ; Manwa Ng
Author_Institution :
Dept. of Otorhinolaryngology Head & Neck Surg., Gen. Hosp., Beijing, China
Abstract :
Although perceptual evaluation is considered the gold standard for examining normal and pathological voice quality, its considerable inter- and intra-listener variability cannot be neglected. This variability results from a number of confounding factors, such as listeners' perceptual bias, listeners' experience, and the type of rating scale used. Automatic objective assessment currently provides a very useful tool for the diagnosis of pathological voices, and acoustic analysis can be a useful complementary tool for determining the severity of dysphonia. The present study aimed to develop a complementary automatic assessment system for voice quality using multidimensional acoustic measures based on the well-known GRBAS scale. A total of 65 acoustic measures, including Mel-frequency cepstral coefficients, glottal-to-noise excitation ratio, and vocal fold excitation ratio, constituted the feature set. Additionally, to reduce redundancy among the features, three different feature extraction techniques were applied. Multiclass classification was performed with a support vector machine (SVM) using a radial basis function (RBF) kernel. The classification results were moderately correlated with GRBAS ratings of severity, with the best accuracy around 70%. This suggests that such multidimensional acoustic analysis can be an appropriate assessment tool for determining the presence and severity of voice disorders.
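The pipeline outlined in the abstract (a 65-dimensional feature vector, a redundancy-reducing step, then multiclass RBF-kernel SVM classification) could look roughly like the sketch below. This is only an illustration under stated assumptions, not the authors' implementation: the data are synthetic, PCA stands in for one of the three unnamed feature extraction techniques (it appears among the keywords, but the abstract does not say how it was used), and the sample count, number of retained components, four severity grades, and SVM hyperparameters are all hypothetical choices. It assumes NumPy and scikit-learn are available.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the study's data): 200 voice samples,
# each with 65 acoustic measures, labelled with a severity grade 0..3.
n_samples, n_features, n_grades = 200, 65, 4
grades = rng.integers(0, n_grades, size=n_samples)
features = rng.normal(size=(n_samples, n_features))
# Assumption for the demo: let the first five measures depend on the grade.
features[:, :5] += grades[:, None]

# Standardize, reduce redundancy with PCA (assumed technique and size),
# then classify with an RBF-kernel SVM (hypothetical hyperparameters).
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=10),
    SVC(kernel="rbf", C=1.0, gamma="scale"),
)

# 5-fold cross-validated accuracy on the synthetic data.
scores = cross_val_score(model, features, grades, cv=5)
mean_accuracy = scores.mean()
print(f"mean 5-fold accuracy: {mean_accuracy:.2f}")
```

Wrapping the scaler and PCA in the same pipeline as the classifier means they are refit on each training fold, so the cross-validated accuracy is not inflated by information from the held-out samples.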
Keywords :
diseases; medical signal processing; patient diagnosis; radial basis function networks; speech processing; support vector machines; GRBAS scale; RBF kernel-SVM; automatic objective assessment; complementary automatic assessment system; dysphonia; feature extraction techniques; glottal-to-noise excitation ratio; interlisteners variability; intralisteners variability; mel-frequency cepstral coefficients; multidimensional acoustic analysis; pathological voice quality; perceptual voice quality evaluation; vocal fold excitation ratio; voice disorders; voice quality assessment; Accuracy; Acoustic measurements; Classification algorithms; Noise; Pathology; Principal component analysis; Speech; GRBAS; acoustic analysis; automatic assessment; voice quality
Conference_Title :
Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on
Conference_Location :
Singapore
DOI :
10.1109/ISCSLP.2014.6936628