DocumentCode :
3047145
Title :
Reversing the error-correction scheme for a fault-tolerant indexing
Author :
Berkovich, Simon ; El-Qawasmeh, Eyas
Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., George Washington Univ., Washington, DC, USA
fYear :
1998
fDate :
30 Mar-1 Apr 1998
Firstpage :
527
Abstract :
Summary form only given. The article presents an innovative approach to approximate matching of multi-attribute objects based on reversing the conventional scheme of error-correction coding. The approximate matching problem arises primarily in information retrieval systems, which can store fuzzily described items and operate with nebulous search criteria. To establish an approximate equivalence relation on a set of multi-attribute objects, it has been suggested to apply a decoding procedure to the binary vectors corresponding to these objects and to use the resulting message words as hash codes. With this hashing technique it is possible to construct “fault-tolerant” indices that allow certain mismatches of binary vectors in terms of the Hamming metric. The simplest practical realization of this technique is based on the so-called perfect Golay code, which maps 23-bit vectors into 12-bit message words. In this case, two different 23-bit vectors at a Hamming distance of 2 would have some common 12-bit indices. This makes it possible to directly retrieve the neighborhood of 23-bit vectors with up to two mismatches from a given key. The proposed technique employs a reasonable amount of redundancy and can trade extra memory for the speed and range of searching. Besides its direct application to information retrieval, the technique is also beneficial for complex computational procedures that incorporate near-matching operations. A typical procedure of this kind is the recovery of close matches from vector-quantization tables.
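The hashing idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation; several details are assumptions made for illustration: the generator polynomial 0xC75 (one of the two standard generators of the cyclic (23,12) Golay code), a systematic encoding in which the message occupies the top 12 bits, syndrome-table decoding, and the rule that a vector's index set consists of the message words decoded from the vector itself and from its 23 single-bit variants. The names `syndrome`, `encode`, `decode`, and `indices` are hypothetical.

```python
from itertools import combinations

N, K = 23, 12   # codeword length and message length of the binary Golay code
GEN = 0xC75     # assumed generator: x^11 + x^10 + x^6 + x^5 + x^4 + x^2 + 1


def syndrome(v: int) -> int:
    """Remainder of the 23-bit polynomial v modulo GEN over GF(2)."""
    for shift in range(N - 1, N - K - 1, -1):   # bit positions 22 down to 11
        if (v >> shift) & 1:
            v ^= GEN << (shift - (N - K))
    return v


# The code is perfect: every 11-bit syndrome corresponds to exactly one
# error pattern of weight 0..3, so this table has 2**11 entries.
ERROR_FOR = {}
for weight in range(4):
    for bits in combinations(range(N), weight):
        e = sum(1 << b for b in bits)
        ERROR_FOR[syndrome(e)] = e


def encode(msg: int) -> int:
    """Systematic encoding: 12 message bits followed by 11 check bits."""
    return (msg << (N - K)) | syndrome(msg << (N - K))


def decode(v: int) -> int:
    """Map any 23-bit vector to the 12-bit message of its nearest codeword."""
    return (v ^ ERROR_FOR[syndrome(v)]) >> (N - K)


def indices(v: int) -> set:
    """Assumed index rule: decode v and each of its 23 one-bit neighbours."""
    neighbours = [v] + [v ^ (1 << b) for b in range(N)]
    return {decode(u) for u in neighbours}


if __name__ == "__main__":
    a = 0x2A3B4C                     # arbitrary 23-bit key
    b = a ^ (1 << 5) ^ (1 << 17)     # same key with two bits flipped
    print(sorted(indices(a) & indices(b)))   # shared hash codes, never empty
```

Under this assumed rule, two vectors at Hamming distance 2 always share an index: the vector lying one bit flip from each of them belongs to both neighbourhoods, so its decoded message word appears in both index sets.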
Keywords :
Golay codes; cryptography; decoding; error correction codes; file organisation; indexing; information retrieval systems; vector quantisation; 12 bit; 23 bit; Hamming metrics; approximate equivalence relation; approximate matching; binary vectors; data compression; decoding; error-correction coding; fault-tolerant indexing; fault-tolerant indices; files; fuzzily described items; hash codes; information retrieval systems; message words; multi-attribute objects; perfect Golay code; redundancy; searching criteria; speech recognition; vector-quantization tables; Decoding; Fault tolerance; File systems; Hamming distance; Indexing; Information retrieval; Production; Redundancy; Signal processing; Speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Compression Conference, 1998. DCC '98. Proceedings
Conference_Location :
Snowbird, UT
ISSN :
1068-0314
Print_ISBN :
0-8186-8406-2
Type :
conf
DOI :
10.1109/DCC.1998.672237
Filename :
672237