Title :
Capacity/Storage Tradeoff in High-Dimensional Identification Systems
Author_Institution :
Dept. of Electr. Eng., Univ. of California, Riverside, CA
fDate :
5/1/2009
Abstract :
The asymptotic tradeoff between the number of distinguishable objects and the necessary storage space (or, equivalently, the search complexity) in an identification system is investigated. In the scenario considered, high-dimensional, noisy feature vectors extracted from objects are first compressed and then enrolled in the database. When a user submits a random query object, the noisy feature vector extracted from it is compared against the compressed entries, one of which is output as the identified object. The first result is a complete single-letter characterization of the achievable storage and identification rates (measured in bits per feature dimension), subject to a vanishing probability of identification error as the dimensionality of the feature vectors grows large. This single-letter characterization is then extended to a multistage system in which, depending on the number of entries, identification is performed using part or all of the bits recorded in the database. Finally, it is shown that a necessary and sufficient condition for a two-stage system to achieve the single-stage capacity at each stage is Markovity of the optimal test channels.
Keywords :
Markov processes; storage management; visual databases; Markovity; capacity-storage tradeoff; database; extracted noisy feature vector; high-dimensional identification systems; multistage system; random query object; single-letter characterization; storage space; Feature extraction; Impedance; Indexing; Information retrieval; Information theory; Random access memory; Spatial databases; Statistics; Sufficient conditions; System testing; Capacity; databases; identification systems; successive refinement
Journal_Title :
Information Theory, IEEE Transactions on
DOI :
10.1109/TIT.2009.2016057