Title :
Recognition of 24 Thai spoken Vowels Using the coefficients of 3rd Order Polynomial Regression on the Voice Energy and Spectrum of LPC on the Bark Scale
Author :
Songwatana, K. ; Sriratanapaprat, S. ; Kultap, P. ; Sittiprasert, K. ; Suktangman, N.
Author_Institution :
Faculty of Engineering, King Mongkut's Institute of Technology, Ladkrabang, 3-2 Chalongkrung Road, Ladkrabang, Bangkok 10520, E-mail: kskraisi@kmitl.ac.th
Abstract :
This paper presents a vowel recognition method for spoken Thai. The Thai language consists of 9 short unmixed vowels (a, i, ω, u, o, e, ε, γ, [unk]); 9 long unmixed vowels (aa, ii, ωω, uu, oo, ee, εε, γγ, [unk][unk]); 3 short mixed vowels (ia, ωa, ua); and 3 long mixed vowels (i:a:, ω:a:, u:a:). The proposed method uses three-stage decision making: step 1 distinguishes long and short vowels using the coefficients of a third-order polynomial regression of the signal energy as the feature set and 5-NN as the classification method; step 2 classifies each voice segment (frame) into one of the 9 basic vowels using 18 critical band intensities as the feature set and 9-NN as the classification method; finally, step 3 decides whether each frame contains a mixed or an unmixed vowel via a thresholding method. This solution differs from conventional speech recognition mainly because a decision is made for each frame, whereas conventional speech recognition chooses the best decision for a sequence of frames forming a word or a sentence. The algorithm is evaluated on 3024 voice samples from male and female subjects, and each step of the algorithm is evaluated successively.
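Step 1 of the abstract (third-order polynomial regression on the signal energy, followed by 5-NN) can be illustrated with a minimal sketch. It is not the authors' implementation: the frame length, hop size, sample rate, the use of seconds as the time axis, and the function names `frame_energy`, `energy_poly_features`, and `knn_predict` are all assumptions made for illustration.

```python
import numpy as np

def frame_energy(signal, frame_len=256, hop=128):
    """Short-time energy per frame (frame_len and hop are assumed values)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    return np.array([
        np.sum(signal[i * hop : i * hop + frame_len] ** 2)
        for i in range(n_frames)
    ])

def energy_poly_features(signal, sample_rate=16000, frame_len=256, hop=128, order=3):
    """Fit a polynomial to the energy contour and return its coefficients.

    The time axis is in seconds (an assumption), so the coefficients
    also reflect how long the vowel lasts.
    """
    energy = frame_energy(signal, frame_len, hop)
    t = np.arange(len(energy)) * hop / sample_rate
    # np.polyfit returns order + 1 coefficients, highest power first.
    return np.polyfit(t, energy, order)

def knn_predict(train_x, train_y, query, k=5):
    """Majority vote among the k nearest training vectors (Euclidean distance)."""
    dists = np.linalg.norm(train_x - query, axis=1)
    nearest = train_y[np.argsort(dists)[:k]]
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[np.argmax(counts)]
```

Under the same assumptions, `knn_predict` with `k=9` applied to an 18-element vector of critical band intensities per frame would correspond to step 2; how those intensities are computed from the LPC spectrum is not specified in the abstract and is left out here.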
Keywords :
Decision making; Ear; Equations; Humans; Linear predictive coding; Lungs; Natural languages; Polynomials; Signal processing; Speech recognition;
Conference_Title :
Wireless Pervasive Computing, 2006 1st International Symposium on
Print_ISBN :
0-7803-9410-0
DOI :
10.1109/ISWPC.2006.1613615