Title :
Computer aided recognition of pathological voice
Author :
Wahed, Manal Abdel
Author_Institution :
Fac. of Eng., Cairo Univ., Cairo, Egypt
Abstract :
Laryngeal diseases and vocal fold pathologies strongly affect the quality of voice production. Many approaches have been developed to analyze acoustic parameters for the objective assessment of pathological voice. The aim of this research is to propose a user-friendly system for discriminating between normal and diseased voice. Feature extraction was applied to the voice signal in both the time domain and the frequency domain. The time-domain features are Zero Crossing Rate (ZCR) and Short-Time Energy. The frequency-domain features are Mel-Frequency Cepstral Coefficients (MFCC) and Linear Predictive Coding (LPC). Classification was based on threshold detection for each feature or group of features. The analysis resulted in the following conditions for a normal voice signal: Energy mean > 0.07, ZCR Max < 0.23, ZCR Mean [0.09:0.13], LPC [110:130] or [167:220], and finally MFCC [130:150]. The proposed system yielded its highest accuracy of 90% when combining ZCR Mean AND ZCR Max, its highest sensitivity of 100% with ZCR Mean alone, and its highest specificity of 97% when combining ZCR Max AND MFCC, and also when combining ZCR Mean AND ZCR Max. The proposed method is quantitative and non-invasive, allowing the identification and monitoring of vocal system disorders, supporting early detection of laryngeal pathologies, and reducing the cost and time required for basic analysis.
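The two time-domain features and the combined "ZCR Mean AND ZCR Max" rule can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the sampling rate, frame length, overlap, or amplitude normalization, so the 8 kHz rate, 400-sample frames, 200-sample hop, inclusive interval bounds, and all function names below are assumptions made only for illustration. The thresholds are the ones quoted in the abstract, and they are meaningful only under the authors' own preprocessing.

```python
import math


def frame_signal(x, frame_len, hop):
    """Split a sequence into overlapping frames; a trailing partial frame is dropped."""
    if len(x) < frame_len:
        raise ValueError("signal is shorter than one frame")
    return [x[i:i + frame_len] for i in range(0, len(x) - frame_len + 1, hop)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def summarize(x, frame_len=400, hop=200):
    """Per-recording statistics; frame_len and hop are assumed values."""
    frames = frame_signal(x, frame_len, hop)
    zcr = [zero_crossing_rate(f) for f in frames]
    energy = [short_time_energy(f) for f in frames]
    return {
        "zcr_mean": sum(zcr) / len(zcr),
        "zcr_max": max(zcr),
        "energy_mean": sum(energy) / len(energy),
    }


def is_normal_by_zcr(features):
    """'ZCR Mean AND ZCR Max' rule with the abstract's thresholds.

    Treating [0.09:0.13] as a closed interval is an assumption.
    """
    return 0.09 <= features["zcr_mean"] <= 0.13 and features["zcr_max"] < 0.23


# Synthetic 440 Hz tone at an assumed 8 kHz sampling rate (not real voice data).
SAMPLE_RATE = 8000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
features = summarize(tone)
print(features, is_normal_by_zcr(features))
```

The synthetic tone only exercises the code path; it says nothing about real recordings. The LPC and MFCC conditions are left out because the abstract does not say how those coefficient sets are reduced to the single values it thresholds.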
Keywords :
acoustic signal processing; diseases; feature extraction; graphical user interfaces; human computer interaction; linear predictive coding; medical disorders; medical signal processing; patient monitoring; sensitivity; speech; speech processing; speech synthesis; time-frequency analysis; LPC; Mel-frequency cepstral coefficients; acoustic parameters; classification; computer aided recognition; feature extraction technique; frequency domain analysis; laryngeal diseases; linear predictive coding; normal voice signal; objective judgment; pathological voice; sensitivity; short time energy; threshold detection; time domain analysis; user friendly system; vocal fold pathologies; vocal system disorder identification; vocal system disorder monitoring; voice production; voice signal; zero crossing rate; Accuracy; Educational institutions; Feature extraction; Mel frequency cepstral coefficient; Pathology; Sensitivity and specificity; Software; disordered voice features; frequency domain features; pathological voice; time domain features;
Conference_Title :
2014 31st National Radio Science Conference (NRSC)
Conference_Location :
Cairo
Print_ISBN :
978-1-4799-3820-9
DOI :
10.1109/NRSC.2014.6835096