• DocumentCode
    1140149
  • Title

    Estimating the information content of symbol sequences and efficient codes

  • Author

    Grassberger, Peter

  • Author_Institution
    Dept. of Phys., Wuppertal Univ., West Germany
  • Volume
    35
  • Issue
    3
  • fYear
    1989
  • fDate
    5/1/1989 12:00:00 AM
  • Firstpage
    669
  • Lastpage
    675
  • Abstract
    Several variants of an algorithm for estimating Shannon entropies of symbol sequences are presented. They are all related to the Lempel-Ziv algorithm (1976, 1977) and to recent algorithms for estimating Hausdorff dimensions. The average storage and running times increase as N and N log N, respectively, with the sequence length N. These algorithms proceed basically by constructing efficient codes. They seem to be the optimal algorithms for sequences with strong long-range correlations, e.g., natural languages. An application to written English illustrates their use.
  • Keywords
    FORTRAN listings; binary sequences; codes; entropy; information theory; Hausdorff dimensions; Lempel-Ziv algorithm; Shannon entropy; efficient codes; information content; strong long-range correlations; symbol sequences; written English; Binary codes; Binary sequences; Disk recording; Entropy; Gaussian processes; Information theory; Natural languages; Optical recording; Physics; Probability
  • fLanguage
    English
  • Journal_Title
    Information Theory, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9448
  • Type

    jour

  • DOI
    10.1109/18.30993
  • Filename
    30993