• DocumentCode
    323835
  • Title
    A neural architecture for computing acoustic-phonetic invariants
  • Author
    Tsiang, Elaine
  • Author_Institution
    Monowave Corp., Seattle, WA, USA
  • Volume
    2
  • fYear
    1998
  • fDate
    12-15 May 1998
  • Firstpage
    1109
  • Abstract
    The proposed neural architecture consists of an analytic lower net and a synthetic upper net. This paper focuses on the upper net. The lower net performs a 2D multiresolution wavelet decomposition of an initial spectral representation to yield a multichannel representation of local frequency modulations at multiple scales. From this representation, the upper net synthesizes increasingly complex features, resulting in a set of acoustic observables at the top layer with multiscale context dependence. The upper net also provides invariance under frequency shifts and under dilatations in tone intervals and time intervals by building these transformations into the architecture. Application of this architecture to the recognition of gross and fine phonetic categories from continuous speech of diverse speakers shows that it provides high accuracy and strong generalization from modest amounts of training data.
  • Keywords
    acoustic signal processing; feature extraction; frequency modulation; learning (artificial intelligence); neural net architecture; signal representation; signal resolution; spectral analysis; speech recognition; wavelet transforms; 2D multiresolution wavelet decomposition; acoustic observables; acoustic-phonetic invariants; analytic lower net; complex features synthesis; continuous speech recognition; dilatations; fine phonetic categories; frequency shifts; gross phonetic categories; high accuracy; local frequency modulation; multichannel representation; multiple scales; multiscale context dependence; neural architecture; spectral representation; synthetic upper net; time intervals; tone intervals; training data; Buildings; Computer architecture; Costs; Frequency modulation; Hidden Markov models; Neural networks; Robustness; Spectral analysis; Speech recognition; Training data
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on
  • Conference_Location
    Seattle, WA
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-4428-6
  • Type
    conf
  • DOI
    10.1109/ICASSP.1998.675463
  • Filename
    675463