Title :
Vocal tract length normalization for vowel recognition in low resource languages
Author :
Sharma, Shantanu ; Madhavi, Maulik C. ; Patil, Hemant A.
Author_Institution :
Dhirubhai Ambani Inst. of Inf. & Commun. Technol. (DA-IICT), Gandhinagar, India
Abstract :
Vocal Tract Length Normalization (VTLN) is used to build Automatic Speech Recognition (ASR) systems that are normalized for speaker vocal tract length. It improves ASR performance by accounting for physiological differences among speakers. Recently, a number of speech recognition applications have been developed for Indian languages. In this paper, we use a state-of-the-art VTLN method based on the maximum likelihood approach. A vowel recognition system is developed for two low-resource Indian languages, viz., Gujarati and Marathi. Appropriate warping factors are obtained for all speakers used in the training and testing procedures. Vowel recognition performance improves compared with state-of-the-art Mel Frequency Cepstral Coefficient (MFCC) features.
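The maximum likelihood selection of a warping factor mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract and keywords point to an HMM-based recognizer with warped filter banks, whereas this sketch scores features against a single diagonal Gaussian as a stand-in. The function names, the piecewise-linear warp with a knee at 85% of the maximum frequency, the search grid of 0.88 to 1.12 in steps of 0.02, and the toy formant values are all assumptions chosen for illustration.

```python
import numpy as np


def piecewise_linear_warp(freqs, alpha, f_max, knee=0.85):
    """Warp frequencies by alpha below the knee; above it, join linearly
    so that f_max still maps to f_max (keeps the bandwidth fixed)."""
    freqs = np.asarray(freqs, dtype=float)
    f0 = knee * f_max
    upper_slope = (f_max - alpha * f0) / (f_max - f0)
    return np.where(freqs <= f0,
                    alpha * freqs,
                    alpha * f0 + upper_slope * (freqs - f0))


def diag_gaussian_loglik(features, mean, var):
    """Total log-likelihood of feature frames under a diagonal Gaussian.
    Stand-in (assumption) for scoring against a trained acoustic model."""
    features = np.atleast_2d(features)
    diff = features - mean
    per_frame = -0.5 * (np.log(2.0 * np.pi * var) + diff ** 2 / var).sum(axis=1)
    return float(per_frame.sum())


def estimate_warp_factor(extract, mean, var, alphas=None):
    """Grid search: return the alpha whose warped features score highest."""
    if alphas is None:
        # Assumed grid; the source does not state the range or step.
        alphas = np.round(np.linspace(0.88, 1.12, 13), 2)
    scores = [diag_gaussian_loglik(extract(a), mean, var) for a in alphas]
    return float(alphas[int(np.argmax(scores))]), scores


# Toy data (hypothetical): reference formant-like peaks in Hz, and a speaker
# whose peaks are all compressed by a factor of 1.06 relative to the reference.
reference_mean = np.array([700.0, 1200.0, 2500.0])
reference_var = np.array([100.0, 100.0, 100.0]) ** 2
speaker_peaks = reference_mean / 1.06


def extract(alpha):
    # Hypothetical feature extractor: warp the speaker's peaks by alpha.
    return piecewise_linear_warp(speaker_peaks, alpha, f_max=8000.0)


best_alpha, _ = estimate_warp_factor(extract, reference_mean, reference_var)
print("estimated warping factor:", best_alpha)
```

In a full system the same search would be run once per speaker, with the feature extractor recomputing MFCCs through a warped filter bank and the score coming from the recognizer's acoustic model rather than a single Gaussian.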
Keywords :
cepstral analysis; maximum likelihood estimation; natural language processing; speech recognition; ASR systems; Gujarati; Indian languages; MFCC; Marathi; Mel frequency cepstral coefficients; VTLN; automatic speech recognition; low resource languages; maximum likelihood approach; vocal tract length normalization; vowel recognition system; warping factors; Databases; Filter banks; Hidden Markov models; Mel frequency cepstral coefficient; Speech; Testing; Training; Lee-Rose method; warping factor
Conference_Title :
Asian Language Processing (IALP), 2014 International Conference on
Conference_Location :
Kuching
DOI :
10.1109/IALP.2014.6973516