• DocumentCode
    2018275
  • Title
    A time-frequency segmental neural network for phoneme recognition
  • Author
    Basu, Anjan ; Svendsen, Torbjorn
  • Author_Institution
    Dept. of Telecommun., Norwegian Inst. of Technol., Trondheim, Norway
  • Volume
    1
  • fYear
    1993
  • fDate
    27-30 April 1993
  • Firstpage
    509
  • Abstract
    The authors propose a time-frequency segmental neural network (TFSNN) which classifies phonemes according to the two-dimensional time-frequency distribution of the whole phonetic segment. It uses a network architecture similar to those used for optical character recognition (OCR) to provide local shift invariance along both the time and the frequency axes. The TFSNN can be used in place of a segmental neural network (SNN) in a hybrid hidden Markov model (HMM)/artificial neural network (ANN) system for automatic speech recognition, as it shows significantly better performance than the SNN. The training time for the TFSNN is also shorter, as it employs very few connection weights compared with the SNN.
  • Keywords
    hidden Markov models; learning (artificial intelligence); neural nets; speech recognition; time-frequency analysis; automatic speech recognition; connection weights; hidden Markov model; local shift invariance; network architecture; performance; phoneme recognition; time-frequency segmental neural network; training times
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1993. ICASSP-93., 1993 IEEE International Conference on
  • Conference_Location
    Minneapolis, MN, USA
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7402-9
  • Type
    conf
  • DOI
    10.1109/ICASSP.1993.319167
  • Filename
    319167