• DocumentCode
    1327637
  • Title

    Temporal difference learning applied to sequential detection

  • Author

    Guo, Chengan ; Kuh, Anthony

  • Author_Institution
    Dept. of Electr. Eng., Hawaii Univ., Honolulu, HI, USA
  • Volume
    8
  • Issue
    2
  • fYear
    1997
  • fDate
    3/1/1997 12:00:00 AM
  • Firstpage
    278
  • Lastpage
    287
  • Abstract
    This paper proposes a novel neural-network method for sequential detection. We first examine the optimal parametric sequential probability ratio test (SPRT) and make a simple equivalent transformation of the SPRT that makes it suitable for neural-network architectures. We then discuss how neural networks can learn the SPRT decision functions from observation data and labels. Conventional supervised learning algorithms have difficulty handling variable-length observation sequences, but a reinforcement learning algorithm, the temporal difference (TD) learning algorithm, is well suited to training the neural network. The entire neural network is composed of context units followed by a feedforward neural network. The context units are necessary to store the dynamic information needed to make good decisions. For an appropriate neural-network architecture, trained with independent and identically distributed (iid) observations by the TD learning algorithm, we show that the neural-network sequential detector can closely approximate the optimal SPRT with similar performance. The neural-network sequential detector has the additional advantage that it is a nonparametric detector that does not require probability density functions. Simulations on iid Gaussian data show that the neural network and the SPRT have similar performance.
  • Keywords
    Gaussian distribution; feedforward neural nets; learning (artificial intelligence); neural net architecture; nonparametric statistics; statistical analysis; context units; decision functions; dynamic information; equivalent transformation; feedforward neural network; iid Gaussian data; independent and identically distributed observations; labels; nonparametric detector; observation data; optimal parametric sequential probability ratio test; reinforcement learning algorithm; sequential detection; temporal difference learning; Density functional theory; Detectors; Feedforward neural networks; Neural networks; Parametric statistics; Probability density function; Sequential analysis; Signal processing algorithms; Supervised learning; Testing
  • fLanguage
    English
  • Journal_Title
    Neural Networks, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9227
  • Type

    jour

  • DOI
    10.1109/72.557666
  • Filename
    557666
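
The abstract takes the parametric SPRT on iid Gaussian data as its reference detector. For orientation, that baseline might be sketched as follows. This is Wald's classical SPRT with his approximate thresholds, not the paper's neural-network detector or its TD training procedure. The function name `sprt_gaussian`, the means `mu0` and `mu1`, the common standard deviation `sigma`, and the error targets `alpha` and `beta` are all illustrative assumptions that do not come from the record.

```python
import math
import random


def sprt_gaussian(samples, mu0=0.0, mu1=1.0, sigma=1.0, alpha=0.05, beta=0.05):
    """Wald's SPRT for H0: mean = mu0 versus H1: mean = mu1 (iid Gaussian, known sigma).

    Returns (decision, n_used). decision is 1 (accept H1), 0 (accept H0),
    or None if the samples run out before either threshold is crossed.
    All default parameter values are hypothetical, chosen for illustration.
    """
    # Wald's approximate log-thresholds for target error rates alpha and beta.
    upper = math.log((1.0 - beta) / alpha)
    lower = math.log(beta / (1.0 - alpha))

    llr = 0.0  # running log-likelihood ratio
    n = 0
    for n, x in enumerate(samples, start=1):
        # Log-likelihood ratio increment for two equal-variance Gaussians.
        llr += (mu1 - mu0) * (x - 0.5 * (mu0 + mu1)) / sigma ** 2
        if llr >= upper:
            return 1, n
        if llr <= lower:
            return 0, n
    return None, n


if __name__ == "__main__":
    # Draw observations from the H1 distribution with a fixed seed.
    rng = random.Random(0)
    data = (rng.gauss(1.0, 1.0) for _ in range(1000))
    print(sprt_gaussian(data))
```

The number of observations consumed varies from run to run, which is the variable-length property the abstract says conventional supervised learning handles poorly. The sketch also needs the two Gaussian densities explicitly, whereas the abstract describes the neural-network detector as nonparametric.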