Regional Information Center for Science and Technology

DocumentCode :

3226552

Title :

DCA Using Suffix Arrays

Author :

Fiala, Martin ; Holub, Jan

Author_Institution :

Czech Tech. Univ. in Prague, Prague

fYear :

2008

fDate :

25-27 March 2008

Firstpage :

516

Lastpage :

516

Abstract :

DCA (Data Compression using Antidictionaries) is a novel lossless data compression method working on bit streams, presented by Crochemore et al. DCA takes advantage of words that do not occur as factors in the text, i.e. that are forbidden. Because of these forbidden words (antiwords), some symbols in the text can be predicted. We build the antidictionary using a suffix array in time O(k * N log N), where k is the maximal antiword length. The suffix array and LCP array, constructed over the binary alphabet, will be 8 times the length of the input text. Still, the memory requirements for suffix array and LCP construction depend only on the length N of the input text, growing as O(N), instead of the exponential complexity of a suffix trie.
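
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' O(k * N log N) construction: the suffix array here is built by naive sorting, and the function names (`build_suffix_array`, `is_factor`, `minimal_antiwords`, `predict`, `encode`, `decode`) are hypothetical, chosen only for this illustration. It assumes the input is a string of '0' and '1' characters, treats a word as an antiword when it does not occur in the text but both its longest proper prefix and suffix do, and shows how antiwords let an encoder drop predictable bits.

```python
def build_suffix_array(text):
    # Naive construction by sorting all suffixes (demo only, not the paper's method).
    return sorted(range(len(text)), key=lambda i: text[i:])

def is_factor(text, sa, word):
    # A word is a factor iff it is a prefix of some suffix: binary search in sa.
    if not word:
        return True
    m, lo, hi = len(word), 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < word:
            lo = mid + 1
        else:
            hi = mid
    return lo < len(sa) and text[sa[lo]:sa[lo] + m] == word

def minimal_antiwords(text, k):
    # Collect minimal forbidden words of length <= k over the alphabet {0, 1}.
    sa = build_suffix_array(text)
    found = set()
    for start in range(len(text) + 1):
        for length in range(k):
            if start + length > len(text):
                break
            v = text[start:start + length]      # v is a factor (possibly empty)
            for b in "01":
                w = v + b
                # w is forbidden, but its longest proper suffix still occurs.
                if not is_factor(text, sa, w) and is_factor(text, sa, w[1:]):
                    found.add(w)
    return sorted(found)

def predict(prefix, antiwords):
    # If the text so far ends with an antiword minus its last bit,
    # the next bit must be the complement of that last bit.
    for w in antiwords:
        if prefix.endswith(w[:-1]):
            return "1" if w[-1] == "0" else "0"
    return None

def encode(text, antiwords):
    # Emit only the bits that the antidictionary cannot predict.
    return "".join(bit for i, bit in enumerate(text)
                   if predict(text[:i], antiwords) is None)

def decode(code, antiwords, n):
    # Rebuild n bits, reading from the code only when no prediction exists.
    out, bits = "", iter(code)
    for _ in range(n):
        p = predict(out, antiwords)
        out += p if p is not None else next(bits)
    return out

ad = minimal_antiwords("0110", 3)
print(ad)
print(encode("0110", ad))
```

In this toy case the decoder needs the antidictionary and the original length in addition to the emitted bits; how those are transmitted is outside the scope of the sketch.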

Keywords :

computational complexity; data compression; data structures; text analysis; exponential complexity; lossless data compression method; maximal antiword length; suffix array; suffix trie; text symbol prediction; time complexity; Compressors; Computer science; Data compression; Data engineering; Encoding; Optical arrays; Transducers; Data Compression using Antidictionaries

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Data Compression Conference, 2008. DCC 2008

Conference_Location :

Snowbird, UT

ISSN :

1068-0314

Print_ISBN :

978-0-7695-3121-2

Type :

conf

DOI :

10.1109/DCC.2008.95

Filename :

4483343

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3226552