Title :
Compression of graphical structures
Author :
Choi, Yongwook ; Szpankowski, Wojciech
Author_Institution :
Dept. of Comput. Sci., Purdue Univ., West Lafayette, IN, USA
fDate :
June 28 2009-July 3 2009
Abstract :
F. Brooks argues that there is "no theory that gives us a metric for information embodied in structure." Shannon himself alluded to it fifty years earlier in his little-known 1953 paper. Indeed, in the past, information theory dealt mostly with "conventional data," be it textual, image, or video data. However, databases of various sorts have come into existence in recent years for storing "unconventional data," including biological data, Web data, topographical maps, and medical data. In compressing such data structures, one must consider two types of information: the information conveyed by the structure itself, and the information conveyed by the data labels implanted in the structure. In this paper, we attempt to address the former problem by studying information of graphical structures (i.e., unlabeled graphs). In particular, we consider the Erdős-Rényi graphs G(n, p) over n vertices in which edges are added randomly with probability p. We prove that the structural entropy of G(n, p) is $\binom{n}{2} h(p) - \log n! + o(1) = \binom{n}{2} h(p) - n \log n + O(n)$, where $h(p) = -p \log p - (1 - p) \log(1 - p)$ is the entropy rate of a conventional memoryless binary source. Then, we design a two-stage encoding that optimally compresses unlabeled graphs up to the first two leading terms of the structural entropy.
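The entropy formula stated in the abstract can be evaluated numerically. The following is a minimal sketch, not the authors' code: it computes the two leading terms, $\binom{n}{2} h(p) - \log n!$, and drops the $o(1)$ term. The function names are hypothetical, and logarithms are assumed to be base 2 (bits), which the abstract does not specify. Because the result is asymptotic, values for small n should not be read as exact.

```python
import math


def binary_entropy(p: float) -> float:
    """h(p) = -p log2 p - (1 - p) log2(1 - p), in bits (base 2 is an assumption)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # limiting value at the endpoints
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def labeled_entropy(n: int, p: float) -> float:
    """Entropy of the labeled graph: one independent binary choice per vertex pair."""
    return math.comb(n, 2) * binary_entropy(p)


def structural_entropy_estimate(n: int, p: float) -> float:
    """Leading terms C(n, 2) h(p) - log2(n!) from the abstract; the o(1) term is dropped."""
    log2_factorial = math.lgamma(n + 1) / math.log(2)  # log2(n!) without overflow
    return labeled_entropy(n, p) - log2_factorial


if __name__ == "__main__":
    n, p = 1000, 0.1  # illustrative values, not taken from the paper
    print(f"labeled:    {labeled_entropy(n, p):.1f} bits")
    print(f"structural: {structural_entropy_estimate(n, p):.1f} bits (estimate)")
```

The difference between the two printed values is $\log_2 n!$, the number of bits saved by not recording which vertex carries which label.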
Keywords :
data structures; entropy codes; graph theory; probability; source coding; Erdős-Rényi graphs; Web data; biological data; conventional memoryless binary source; data structure compression; databases; graphical structure compression; information theory; medical data; structural entropy; topographical maps; two-stage encoding; Algorithm design and analysis; Biological information theory; Biomedical imaging; Computer science; Data structures; Entropy; Image coding; Image databases; Information theory; Video compression
Conference_Title :
Information Theory, 2009. ISIT 2009. IEEE International Symposium on
Conference_Location :
Seoul
Print_ISBN :
978-1-4244-4312-3
Electronic_ISBN :
978-1-4244-4313-0
DOI :
10.1109/ISIT.2009.5205736