• DocumentCode
    649480
  • Title
    Efficient range distribution query in large-scale scientific data
  • Author
    Chaudhuri, Arindam ; Teng-Yok Lee ; Han-Wei Shen ; Peterka, Tom
  • Author_Institution
    Ohio State Univ., Columbus, OH, USA
  • fYear
    2013
  • fDate
    13-14 Oct. 2013
  • Firstpage
    125
  • Lastpage
    126
  • Abstract
    Frequent access to raw data is no longer practical, if possible at all, for answering queries on large-scale data. This has led to the use of distribution-based data summaries, which can substitute for raw data to answer statistical queries of different kinds. Our work is concerned with the range distribution query, which returns the distribution of an axis-aligned region of any size. We address the challenge of maintaining the interactivity and accuracy of such query results in the presence of large data. This work presents a novel and efficient framework for pre-computing and storing a set of distributions that can be used to query any arbitrary region during post-processing. We adapt an integral-image-based data structure to answer such queries in constant time, and propose a similarity-based encoding technique to reduce the storage cost of the data structure. Our scheme exploits the similarity present among different regions in the data and, hence, among their respective distributions. We demonstrate the use of our technique in various applications that directly or indirectly require distributions.
  • Keywords
    data structures; distributed processing; query processing; scientific information systems; statistical analysis; arbitrary region query; distribution-based data summaries; integral image based data structure; large-scale scientific data; query answering; range distribution query; raw data; similarity-based encoding technique; statistical queries; storage cost reduction;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Large-Scale Data Analysis and Visualization (LDAV), 2013 IEEE Symposium on
  • Conference_Location
    Atlanta, GA
  • Type
    conf
  • DOI
    10.1109/LDAV.2013.6675171
  • Filename
    6675171
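
The abstract mentions an integral-image-based data structure that answers range distribution queries in constant time. The record gives no implementation details, so the following is only a minimal 2D sketch of the general integral histogram idea, not the authors' framework, and it omits the similarity-based encoding entirely. The function names `build_integral_histogram` and `range_histogram`, the equal-width binning, and the half-open index convention are all assumptions made for illustration.

```python
from typing import List

Grid = List[List[float]]


def build_integral_histogram(data: Grid, num_bins: int,
                             lo: float, hi: float) -> List[List[List[int]]]:
    """Pre-compute cumulative histograms (hypothetical helper).

    ih[r][c][b] holds the number of samples in rows [0, r) and
    columns [0, c) whose value falls in bin b. Bins are equal-width
    over [lo, hi); values at or above hi are clamped into the last bin.
    """
    rows, cols = len(data), len(data[0])
    width = (hi - lo) / num_bins
    ih = [[[0] * num_bins for _ in range(cols + 1)] for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            cell = ih[r + 1][c + 1]
            # Inclusion-exclusion over the three already-computed neighbours.
            for b in range(num_bins):
                cell[b] = ih[r][c + 1][b] + ih[r + 1][c][b] - ih[r][c][b]
            # Add the current sample to its bin.
            bin_index = min(int((data[r][c] - lo) / width), num_bins - 1)
            cell[bin_index] += 1
    return ih


def range_histogram(ih: List[List[List[int]]],
                    r0: int, c0: int, r1: int, c1: int) -> List[int]:
    """Histogram of the axis-aligned region rows [r0, r1), cols [c0, c1).

    Uses four corner lookups per bin, so the cost depends on the number
    of bins but not on the size of the queried region.
    """
    num_bins = len(ih[0][0])
    return [
        ih[r1][c1][b] - ih[r0][c1][b] - ih[r1][c0][b] + ih[r0][c0][b]
        for b in range(num_bins)
    ]


if __name__ == "__main__":
    field = [
        [0, 1, 2],
        [3, 0, 1],
        [2, 3, 3],
    ]
    ih = build_integral_histogram(field, num_bins=4, lo=0.0, hi=4.0)
    print(range_histogram(ih, 0, 0, 2, 2))  # top-left 2x2 block
    print(range_histogram(ih, 0, 0, 3, 3))  # whole field
```

The trade-off in this sketch is storage: one full histogram is kept per grid point, which is the kind of cost the abstract says the similarity-based encoding is meant to reduce.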