Title :
Multilevel filtering for high-dimensional image data: why and how
Author :
Ng, Raymond T. ; Tam, Dominic
Author_Institution :
Dept. of Comput. Sci., British Columbia Univ., Vancouver, BC, Canada
Abstract :
It has been shown that filtering is a promising way to support efficient content-based retrieval from image data. However, all existing studies on filtering restrict their attention to two levels. We consider filtering structures that have at least three levels. In the first half of the paper, by analyzing the CPU and I/O costs of various structures, we provide analytic evidence on why three-level structures can often outperform corresponding two-level ones. We provide further experimental results showing that the three-level structures are typically the best, and can beat the two-level ones by a wide margin. In the second half of the paper, we study how to find the (near-)optimal three-level structure for a given dataset. We develop an optimizer that can handle this task effectively and efficiently. Experimental results indicate that in tens of seconds of CPU time, the optimizer can find a filtering structure whose total runtime per query exceeds that of the true optimal structure by only 2-3 percent.
Keywords :
content-based retrieval; database theory; query processing; visual databases; CPU; dataset; experimental results; high-dimensional image data; input output costs; multilevel filtering; runtime per query; three-level structures; Biomedical imaging; Costs; Feature extraction; Filtering; Health information management; Histograms; Image databases; Indexing; Multidimensional systems
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
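Illustration :
The abstract does not specify the distance function, the abstraction used at each level, or the query type, so the following is only a minimal sketch of the general multilevel filtering idea, not the authors' structure or optimizer. It assumes color histograms compared by L1 distance, and it assumes each coarser level is built by summing adjacent bins. Under those assumptions the coarse distance can never exceed the full distance (by the triangle inequality), so filtering discards no true answers. The names `coarsen`, `l1`, `range_query`, and the `factors` parameter are hypothetical.

```python
from typing import List, Sequence


def coarsen(hist: Sequence[float], factor: int) -> List[float]:
    """Sum each run of `factor` adjacent bins into a single bin."""
    return [sum(hist[i:i + factor]) for i in range(0, len(hist), factor)]


def l1(a: Sequence[float], b: Sequence[float]) -> float:
    """L1 (city-block) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def range_query(db: Sequence[Sequence[float]],
                query: Sequence[float],
                eps: float,
                factors: Sequence[int] = (4, 2)) -> List[int]:
    """Return indices of histograms within L1 distance `eps` of `query`.

    Each entry of `factors` is one filter level, coarsest first. Together
    with the final full-resolution check, the default gives three levels.
    A real system would precompute and store the coarse vectors; they are
    recomputed here only to keep the sketch short.
    """
    candidates = list(range(len(db)))
    for f in factors:
        q = coarsen(query, f)
        # Cheap test on the low-dimensional abstraction. Because the coarse
        # distance lower-bounds the full one, a rejection here is always safe.
        candidates = [i for i in candidates if l1(coarsen(db[i], f), q) <= eps]
    # Final level: full-dimensional distance on the surviving candidates only.
    return [i for i in candidates if l1(db[i], query) <= eps]


if __name__ == "__main__":
    db = [
        [1, 0, 0, 0, 0, 0, 0, 0],
        [0, 1, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 1],
        [0.5, 0.5, 0, 0, 0, 0, 0, 0],
    ]
    print(range_query(db, [1, 0, 0, 0, 0, 0, 0, 0], eps=1.0))
```

In this sketch the second histogram passes both coarse levels and is rejected only by the full-resolution check, while the third is rejected at the coarsest level. How many candidates each level removes, and at what CPU and I/O cost, is the kind of trade-off the abstract says the paper analyzes.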