• DocumentCode
    1086096
  • Title

    A description of a parametrically controlled modular structure for speech processing

  • Author

    Dixon, N. ; Silverman, Harvey F.

  • Author_Institution
    IBM Thomas J. Watson Research Center, Yorktown Heights, N.Y.
  • Volume
    23
  • Issue
    1
  • fYear
    1975
  • fDate
    2/1/1975 12:00:00 AM
  • Firstpage
    87
  • Lastpage
    91
  • Abstract
    A system, the modular acoustic processor (MAP), consisting of two major components, has been designed for work in speech recognition. A versatile spectral analysis system, the parametrically controlled analyzer (PCA), serves as input to an hierarchically operated string transcriber (HOST). In the design of this system, controllability and modularity for developmental extensibility were primary concerns. The system, with the exception of initial high-fidelity, direct A/D conversion, is entirely implemented in software, PL/I, with appropriate JCL structures for running under OS/MVT on an IBM 360-91. As an adjunct for obtaining training data, a grayscale interactive system using an IBM 1800 process-control computer has also been implemented. PCA signal processing features parametric selection of several analysis methods, including discrete Fourier transform (DFT), linear predictive coding (LPC), and chirp z-transform (CZT). Also, selection may be made among various smoothing, normalization, interpolation, and F0 estimation methods. PCA develops high-quality spectrographic representations of speech for standard line printers, CRT display, and subsequent processing. PCA also performs spectral-similarity matching and training. HOST consists of a number of processes for performing segmentation, classification, and prosody analysis. Provision is made for complete commutability at the module level as well as at the algorithm level. The segmentation/classification output of HOST is augmented by estimates of confidence. PCA is a packaged, debugged, running system. A first version of HOST is operational.
  • Keywords
    Control system analysis; Control systems; Controllability; Discrete Fourier transforms; Linear predictive coding; Operating systems; Principal component analysis; Spectral analysis; Speech processing; Speech recognition;
  • fLanguage
    English
  • Journal_Title
    Acoustics, Speech and Signal Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0096-3518
  • Type

    jour

  • DOI
    10.1109/TASSP.1975.1162630
  • Filename
    1162630