DocumentCode
3384754
Title
Phrase elimination in greedy parsing dictionary coders with deferred innovation
Author
Yao, Zhen
Author_Institution
Dept. of Comput. Sci., Warwick Univ., Coventry, UK
fYear
2003
fDate
25-27 March 2003
Firstpage
456
Abstract
Summary form only given. LZ dictionary coders parse the message into successive substrings, each consisting of two parts: the citation, which is the longest prefix phrase already held in the dictionary, and the innovation, which is the symbol immediately following the citation. Suppose the input alphabet is A and the dictionary D = {p1, p2, ..., pn} is a set of phrases, where pi ∈ A*, parsed by a greedy-parsing LZ coder. When the dictionary is represented as a search tree, matching a phrase in D against a citation can be viewed as traversing from the root of the tree, matching consecutive input symbols until the mismatching innovation occurs. The dictionary is reduced to D´ = D/E. The phrase index is then encoded by a less redundant code (LRC) with an upper bound on codeword length. The expected number of phrases in D´ was estimated, and experiments verified that the estimate is accurate. The results also show a typical improvement of 3% over LZW coders with LRC and 5% over standard LZW.
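The citation/innovation parse described in the abstract can be illustrated with a minimal sketch of a generic LZ78-style greedy parser that walks a dictionary search tree. This is not the author's coder: it omits phrase elimination, deferred innovation, and the LRC, and the function name `lz78_parse` and the nested-dict tree layout are assumptions made only for illustration.

```python
def lz78_parse(message):
    """Greedy LZ78-style parse (illustrative sketch, not the paper's coder).

    Returns a list of (citation_index, innovation) pairs, where
    citation_index 0 denotes the empty phrase.
    """
    root = {}        # tree node: symbol -> (phrase_index, child_node)
    next_index = 1   # index assigned to the next new phrase
    output = []
    node, citation = root, 0
    for symbol in message:
        if symbol in node:
            # Symbol matches: descend one level; the citation grows.
            citation, node = node[symbol]
        else:
            # Mismatch: this symbol is the innovation. Add the new
            # phrase (citation + innovation) to the dictionary tree.
            node[symbol] = (next_index, {})
            next_index += 1
            output.append((citation, symbol))
            node, citation = root, 0
    if node is not root:
        # Input ended in the middle of a match: emit the citation alone.
        output.append((citation, None))
    return output


print(lz78_parse("abababa"))
```

Each output pair corresponds to one parsed substring: the index of the longest dictionary phrase matched from the root, followed by the symbol that caused the mismatch.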
Keywords
codes; data compression; grammars; search problems; string matching; LRC; citation; codeword length; deferred innovation; greedy parsing dictionary coders; less redundant code; longest prefix phrase; phrase elimination; Code standards; Compressors; Computer science; Data compression; Dictionaries; Image coding; Impedance matching; Technological innovation;
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Compression Conference, 2003. Proceedings. DCC 2003
ISSN
1068-0314
Print_ISBN
0-7695-1896-6
Type
conf
DOI
10.1109/DCC.2003.1194075
Filename
1194075