• DocumentCode
    138737
  • Title
    Computing entropy rate of symbol sources & a distribution-free limit theorem
  • Author
    Chattopadhyay, Ishanu ; Lipson, Hod
  • Author_Institution
    Cornell Univ., Ithaca, NY, USA
  • fYear
    2014
  • fDate
    19-21 March 2014
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    Entropy rate of sequential data-streams naturally quantifies the complexity of the generative process. Thus entropy rate fluctuations could be used as a tool to recognize dynamical perturbations in signal sources, and this recognition could potentially be carried out without explicit background noise characterization. However, state-of-the-art algorithms to estimate the entropy rate have markedly slow convergence, making such entropic approaches non-viable in practice. We present here a fundamentally new approach to estimate entropy rates, which is demonstrated to converge significantly faster in terms of input data lengths, and is shown to be effective in diverse applications ranging from the estimation of the entropy rate of English texts to the estimation of complexity of chaotic dynamical systems. Additionally, the convergence rate of entropy estimates does not follow from any standard limit theorem, and reported algorithms fail to provide any confidence bounds on the computed values. Exploiting a connection to the theory of probabilistic automata, we establish a convergence rate of O(log|s| / |s|^(1/3)) as a function of the input length |s|, which then yields explicit uncertainty estimates, as well as required data lengths to satisfy pre-specified confidence bounds.
  • Keywords
    entropy; estimation theory; English texts; chaotic dynamical systems; convergence rate; distribution-free limit theorem; dynamical perturbations; entropy rate estimation; entropy rate fluctuations; explicit background noise characterization; explicit uncertainty estimates; generative process; prespecified confidence bounds; probabilistic automata; sequential data-streams; signal sources; standard limit theorem; symbol sources; Entropy; Estimation; Heuristic algorithms; Synchronization; Entropy rate; Probabilistic automata; Stochastic processes; Symbolic dynamics;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Sciences and Systems (CISS), 2014 48th Annual Conference on
  • Conference_Location
    Princeton, NJ
  • Type
    conf
  • DOI
    10.1109/CISS.2014.6814175
  • Filename
    6814175