DocumentCode
1237521
Title
On the File Design Problem for Partial Match Retrieval
Author
Du, Hung-Chang
Author_Institution
Department of Computer Science, University of Minnesota
Issue
2
fYear
1985
Firstpage
213
Lastpage
222
Abstract
In the past two decades, the increasing usage of databases and integrated information systems has encouraged the development of file structures suited for partial match retrieval. A partial match query is a query with some number of attributes specified and the rest of them unspecified. One interesting file structure proposed and heavily studied recently is called a multikey hashing scheme, but most of the previous results on designing optimal multikey hashing schemes ignored the record distribution of a file. In this paper we show that the problem of designing an optimal multikey hashing scheme taking into consideration the record distribution is computationally intractable (NP-hard). Therefore, a heuristic approach is necessary. In a multikey hashing scheme, although the directory is space efficient and the search algorithm is fast, due to the insufficient information in the directory some accessed buckets may not contain any record satisfying the given query. Thus, certain retrieval effort is wasted. A new class of file structures which combine a multikey hashing scheme and an indexed descriptor technique is introduced in this paper. By adding some extra information (either record descriptors or bucket descriptors) into the directory of a multikey hashing scheme, either only those buckets which contain at least one record satisfying the given query need to be accessed or the number of accessed buckets which do not contain any record satisfying the query is reduced.
Keywords
Database design; NP-hard; file structures; multikey hashing; partial match retrieval; Computer science; Database systems; Distributed computing; Information retrieval; Information systems; Magnetic devices; Terminology
fLanguage
English
Journal_Title
Software Engineering, IEEE Transactions on
Publisher
IEEE
ISSN
0098-5589
Type
jour
DOI
10.1109/TSE.1985.232197
Filename
1701990