DocumentCode
2941322
Title
Random access from compressed datasets with perfect value hashing
Author
Miller, John M.
Author_Institution
Microsoft Corp., Redmond, WA, USA
fYear
1995
fDate
17-22 Sep 1995
Firstpage
454
Abstract
A representation technique is presented allowing for quick access of individual records from a static compressed dataset. Given a collection of key-record pairs, the representation allows the appropriate short record to be returned for any given key. The approach is a generalization of perfect address hashing. The new approach, called perfect value hashing, uses a carefully chosen pseudo-random number generator to directly produce the correct record for any key in the dataset. This contrasts with address hashing where the random number provides an address which is then used to recover the record from a separate table. Value hashing doesn´t have the theoretical limitations of address hashing, and in practice is more space efficient for records of size less than 36 bits. Value hashing has the added benefit (important when the records are encoded for compression) that variable length records can be represented without an increase in the size of the encoded records. This new technique was used to provide random access from a highly compressed spelling dictionary
Keywords
data compression; data structures; encoding; random number generation; random processes; compressed datasets; compressed spelling dictionary; encoded records; individual records access; key-record pairs; perfect address hashing; perfect value hashing; pseudorandom number generator; random access; representation technique; variable length records; Books; Broadcasting; Cost function; Databases; Dictionaries;
fLanguage
English
Publisher
ieee
Conference_Titel
Information Theory, 1995. Proceedings., 1995 IEEE International Symposium on
Conference_Location
Whistler, BC
Print_ISBN
0-7803-2453-6
Type
conf
DOI
10.1109/ISIT.1995.550441
Filename
550441
Link To Document