• DocumentCode
    1362435
  • Title

    Materialization and Decomposition of Dataspaces for Efficient Search

  • Author

    Song, Shaoxu ; Chen, Lei ; Yuan, Mingxuan

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Hong Kong Univ. of Sci. & Technol., Kowloon, China
  • Volume
    23
  • Issue
    12
  • fYear
    2011
  • Firstpage
    1872
  • Lastpage
    1887
  • Abstract
    Dataspaces consist of large-scale heterogeneous data. Practical dataspace systems should provide a query interface for accessing tuples as a fundamental facility. Previously, an efficient index was proposed for queries with keyword neighborhood over dataspaces. In this paper, we study the materialization and decomposition of dataspaces in order to improve query efficiency. First, we study views of items, which are materialized so that queries can reuse them. When a set of views is materialized, the problem becomes selecting some of them as the optimal plan with the minimum query cost. Efficient algorithms are developed for query planning and view generation. Second, we study partitions of tuples for answering top-k queries. Given a query, we can evaluate the score bounds of the tuples in each partition and prune those partitions whose bounds are lower than the scores of the top-k answers. We also provide a theoretical analysis of query cost and prove that query efficiency cannot be improved by increasing the number of partitions. Finally, we conduct an extensive experimental evaluation to illustrate the superior performance of the proposed techniques.
  • Keywords
    data handling; query processing; dataspace decomposition; dataspace materialization; large-scale heterogeneous data; query efficiency; query interface; query planning; Image color analysis; Indexes; Keyword search; Query processing; Search methods; information retrieval; Dataspaces; decomposition; materialization
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2010.213
  • Filename
    5611525