Title :
A pitch-synchronous digital feature extraction system for phonemic recognition of speech
Author :
Hess, Wolfgang J.
Author_Institution :
Technische Universitaet Muenchen, Munich, Germany
fDate :
2/1/1976
Abstract :
The system described in this paper comprises three main steps: pitch extraction, segmentation, and formant analysis. The pitch extractor uses an adaptive digital filter in the time domain to transform the speech signal into a signal similar to the glottal waveform. Using the levels of the speech signal and of the differenced signal as time-domain parameters, the subsequent segmentation algorithm derives a signal parameter that describes the speed of articulatory movement. From this parameter, the signal is divided into "stationary" and "transitional" segments; each stationary segment is associated with one phoneme. For the formant tracking procedure, a subset of the pitch periods is selected by the segmentation algorithm and transformed into the frequency domain. The formant tracking algorithm uses a maximum detection strategy and continuity criteria for adjacent spectra. After this step, the complete parameter set is passed to an adaptive universal pattern classifier, which is trained on selected material before use. For stationary phonemes, the recognition rate is about 85 percent when the training and test material are uttered by the same speaker. The recognition rate increases to about 90 percent when the segmentation results are used.
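The segmentation idea in the abstract (levels of the speech signal and of the differenced signal, combined into a measure of how fast the signal is changing) can be illustrated with a rough sketch. This is not the paper's algorithm: the fixed-length frames (the paper works on pitch periods), the mean-absolute-value level in dB, the summed level change used as the "movement" measure, and the names `frame_levels`, `label_frames`, and `threshold_db` are all assumptions made for illustration.

```python
import math


def frame_levels(signal, frame_len):
    """Return (signal level, differenced-signal level) in dB for each frame.

    Assumption: level = mean absolute value of the frame, expressed in dB.
    """
    # First-order difference of the signal; the first sample has no predecessor.
    diff = [0.0] + [signal[i] - signal[i - 1] for i in range(1, len(signal))]
    eps = 1e-12  # avoids log10(0) on silent frames
    levels = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        s = signal[start:start + frame_len]
        d = diff[start:start + frame_len]
        level_s = 20 * math.log10(sum(abs(x) for x in s) / frame_len + eps)
        level_d = 20 * math.log10(sum(abs(x) for x in d) / frame_len + eps)
        levels.append((level_s, level_d))
    return levels


def label_frames(signal, frame_len, threshold_db):
    """Label each frame "stationary" or "transitional".

    Assumption: the movement measure is the summed absolute change of both
    levels between adjacent frames, compared against a fixed threshold.
    """
    levels = frame_levels(signal, frame_len)
    if not levels:
        return []
    labels = ["stationary"]  # the first frame has nothing to compare against
    for prev, cur in zip(levels, levels[1:]):
        movement = abs(cur[0] - prev[0]) + abs(cur[1] - prev[1])
        labels.append("transitional" if movement > threshold_db else "stationary")
    return labels
```

With a steady tone whose amplitude jumps once, only the frame at the jump would be expected to exceed a threshold of a few dB; real speech would need a tuned threshold and pitch-synchronous framing.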
Keywords :
Data mining; Feature extraction; Frequency; Humans; Redundancy; Signal processing; Signal processing algorithms; Speech analysis; Speech processing; Speech recognition
Journal_Title :
Acoustics, Speech and Signal Processing, IEEE Transactions on
DOI :
10.1109/TASSP.1976.1162771