• DocumentCode
    3085482
  • Title
    Cost-based Optimization of Complex Scientific Queries
  • Author
    Fomkin, Ruslan; Risch, Tore
  • Author_Institution
    Uppsala Univ., Uppsala
  • fYear
    2007
  • fDate
    9-11 July 2007
  • Firstpage
    1
  • Lastpage
    1
  • Abstract
    High energy physics scientists analyze large amounts of data looking for interesting events when particles collide. These analyses are easily expressed using complex queries that filter events. We developed a cost model for aggregation operators and other functions used in such queries and show that it substantially improves performance. However, the query optimizer still produces suboptimal plans because of estimate errors. Furthermore, the optimization is very slow because of the large query size. We improved the optimization by a profiled grouping strategy where the scientific query is first automatically fragmented into subqueries based on application knowledge. Each fragment is then independently profiled on a sample of events to measure real execution cost and cardinality. An optimized fragmented query is shown to execute faster than a query optimized with the cost model alone. Furthermore, the total optimization time, including fragmentation and profiling, is substantially improved.
  • Keywords
    data analysis; physics computing; query processing; aggregation operators; complex queries; complex scientific queries; cost-based optimization; high energy physics; large query size; optimized fragmented query; query optimizer; scientific query; Aggregates; Cost function; Databases; Dynamic programming; Filters; Information technology; Large Hadron Collider; Optimization methods; Query processing; Testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Scientific and Statistical Database Management, 2007. SSDBM '07. 19th International Conference on
  • Conference_Location
    Banff, Alta.
  • ISSN
    1551-6393
  • Print_ISBN
    0-7695-2868-6
  • Electronic_ISBN
    1551-6393
  • Type
    conf
  • DOI
    10.1109/SSDBM.2007.8
  • Filename
    4274946