Title :
Effectiveness of fractal dimension for ASR in low resource language
Author :
Zaki, Mohammadi ; Shah, N.J. ; Patil, Hemant A.
Author_Institution :
Dhirubhai Ambani Inst. of Inf. & Commun. Technol. (DA-IICT), Gandhinagar, India
Abstract :
We propose to use multiscale fractal dimension (MFD) as components of feature vectors for automatic speech recognition (ASR), especially in low resource languages. Speech, which is known to be a nonlinear process, can be efficiently represented by extracting nonlinear properties, such as the fractal dimension (FD), from the speech segment. During speech production, vortices (generated due to the presence of separated airflow) may travel along the vocal tract and excite vocal tract resonators at the epiglottis, velum, palate, teeth, lips, etc. By Kolmogorov's law, the gradient in energy levels between these vortices produces turbulence. This ruggedness and, in effect, the embedded features of different phoneme classes can be captured by the invariant property of FD. Furthermore, speech is multifractal, which justifies the use of multiscale fractal dimension as feature components for speech. In this paper, we describe the multifractal nature of the speech signal and use this property for the automatic phonetic segmentation task. The results show a significant decrease in % EER (≈ 4.2 % from traditional MFCC base features and ≈ 2.5 % from MFCC appended with 1-D fractal dimension). The DET curves clearly show improvement in performance with the new multiscale fractal dimension-based features for the low resource language under consideration.
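The abstract does not say how the multiscale fractal dimension is estimated, so the following is only a minimal sketch under an assumption: it uses the Minkowski-Bouligand (morphological covering) estimate, in which the frame is dilated and eroded with a flat window of half-width s, the area A(s) between the two envelopes is measured, and the local slope of log(A(s)/s^2) against log(1/s) is taken as the fractal dimension at that scale. The function names `cover_areas` and `multiscale_fd` and the parameters `max_scale` and `fit_width` are hypothetical and are not taken from the paper.

```python
import numpy as np


def cover_areas(frame, max_scale):
    """Area between the flat dilation and erosion of a frame at scales 1..max_scale."""
    x = np.asarray(frame, dtype=float)
    areas = []
    for s in range(1, max_scale + 1):
        # Replicate edge samples so every sample has a full (2s + 1)-point window.
        padded = np.pad(x, s, mode="edge")
        windows = np.lib.stride_tricks.sliding_window_view(padded, 2 * s + 1)
        # Running max is the flat dilation, running min is the flat erosion.
        areas.append(np.sum(windows.max(axis=1) - windows.min(axis=1)))
    return np.array(areas)


def multiscale_fd(frame, max_scale=10, fit_width=3):
    """Local fractal dimension estimates, one per group of fit_width adjacent scales.

    Assumes a non-constant frame; a constant frame has zero cover area,
    so its logarithm is undefined.
    """
    scales = np.arange(1, max_scale + 1)
    areas = cover_areas(frame, max_scale)
    log_inv_scale = np.log(1.0 / scales)
    log_norm_area = np.log(areas / scales**2)
    fd = []
    for i in range(max_scale - fit_width + 1):
        # Least-squares slope over a short run of neighbouring scales.
        slope = np.polyfit(log_inv_scale[i:i + fit_width],
                           log_norm_area[i:i + fit_width], 1)[0]
        fd.append(slope)
    return np.array(fd)
```

Under this assumption, calling `multiscale_fd(frame)` on each short-time speech frame would give a small vector (eight values with the defaults) that could be appended to that frame's MFCC vector. A smooth ramp yields values near 1, while rougher, noise-like frames yield values closer to 2.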
Keywords :
speech recognition; vectors; ASR; DET curve; Kolmogorov law; MFCC base features; automatic phonetic segmentation; automatic speech recognition; energy level; feature vector; low resource language; multiscale fractal dimension; nonlinear property; phoneme class; speech production; vocal tract resonator; vortices; Feature extraction; Fractals; Mel frequency cepstral coefficient; Production; Speech; Speech processing; Vectors; Automatic phonetic segmentation; multifractal; multiscale fractal dimension; nonlinearities;
Conference_Title :
Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on
Conference_Location :
Singapore
DOI :
10.1109/ISCSLP.2014.6936645