DocumentCode
1086096
Title
A description of a parametrically controlled modular structure for speech processing
Author
Dixon, N. ; Silverman, Harvey F.
Author_Institution
IBM Thomas J. Watson Research Center, Yorktown Heights, N.Y.
Volume
23
Issue
1
fYear
1975
fDate
2/1/1975 12:00:00 AM
Firstpage
87
Lastpage
91
Abstract
A system, the modular acoustic processor (MAP), consisting of two major components, has been designed for work in speech recognition. A versatile spectral analysis system, the parametrically controlled analyzer (PCA), serves as input to a hierarchically operated string transcriber (HOST). In the design of this system, controllability and modularity for developmental extensibility were primary concerns. The system, with the exception of the initial high-fidelity, direct A/D conversion, is implemented entirely in software (PL/I), with appropriate JCL structures for running under OS/MVT on an IBM 360-91. As an adjunct for obtaining training data, a grayscale interactive system using an IBM 1800 process-control computer has also been implemented. PCA signal processing features parametric selection among several analysis methods, including discrete Fourier transform (DFT), linear predictive coding (LPC), and chirp z-transform (CZT). Selection may also be made among various smoothing, normalization, interpolation, and F0 estimation methods. PCA develops high-quality spectrographic representations of speech for standard line printers, CRT display, and subsequent processing. PCA also performs spectral-similarity matching and training. HOST consists of a number of processes for performing segmentation, classification, and prosody analysis. Provision is made for complete commutability at the module level as well as at the algorithm level. The segmentation/classification output of HOST is augmented by estimates of confidence. PCA is a packaged, debugged, running system. A first version of HOST is operational.
Keywords
Control system analysis; Control systems; Controllability; Discrete Fourier transforms; Linear predictive coding; Operating systems; Principal component analysis; Spectral analysis; Speech processing; Speech recognition;
fLanguage
English
Journal_Title
Acoustics, Speech and Signal Processing, IEEE Transactions on
Publisher
IEEE
ISSN
0096-3518
Type
jour
DOI
10.1109/TASSP.1975.1162630
Filename
1162630