A Survey on Advancing the DBMS Query Optimizer: Cardinality Estimation,
  Cost Model, and Plan Enumeration
        - URL: http://arxiv.org/abs/2101.01507v1
- Date: Tue, 5 Jan 2021 13:47:45 GMT
- Title: A Survey on Advancing the DBMS Query Optimizer: Cardinality Estimation,
  Cost Model, and Plan Enumeration
- Authors: Hai Lan, Zhifeng Bao, Yuwei Peng
- Abstract summary: A cost-based algorithm is adopted in almost all current database systems.
In the cost model, cardinality, the number of the numbers through an operator plays a crucial role.
Due to the inaccuracy in cardinality estimation, errors in cost, and the huge plan space model, the algorithm cannot find the optimal execution plan for a complex query in a reasonable time.
- Score: 17.75042918159419
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract:   Query optimizer is at the heart of the database systems. Cost-based optimizer
studied in this paper is adopted in almost all current database systems. A
cost-based optimizer introduces a plan enumeration algorithm to find a
(sub)plan, and then uses a cost model to obtain the cost of that plan, and
selects the plan with the lowest cost. In the cost model, cardinality, the
number of tuples through an operator, plays a crucial role. Due to the
inaccuracy in cardinality estimation, errors in cost model, and the huge plan
space, the optimizer cannot find the optimal execution plan for a complex query
in a reasonable time. In this paper, we first deeply study the causes behind
the limitations above. Next, we review the techniques used to improve the
quality of the three key components in the cost-based optimizer, cardinality
estimation, cost model, and plan enumeration. We also provide our insights on
the future directions for each of the above aspects.
 
      
        Related papers
        - Cost-aware Stopping for Bayesian Optimization [53.34052774820105]
 We propose a cost-aware stopping rule for Bayesian optimization that adapts to varying evaluation costs and is free of tuning.<n>We prove a theoretical guarantee bounding the expected cumulative evaluation cost incurred by our stopping rule when paired with state-of-the-art acquisition functions.
 arXiv  Detail & Related papers  (2025-07-16T17:54:14Z)
- Reqo: A Robust and Explainable Query Optimization Cost Model [2.184775414778289]
 We propose a tree model architecture based on Bidirectional Graph Neural Networks (Bi-GNN) aggregated by Gated Recurrent Units (GRUs)
We implement a novel learning-to-rank cost model that effectively quantifies the uncertainty in cost estimates using approximate probabilistic ML.
In addition, we propose the first explainability technique specifically designed for learning-based cost models.
 arXiv  Detail & Related papers  (2025-01-29T04:48:51Z)
- Self-Steering Optimization: Autonomous Preference Optimization for Large   Language Models [79.84205827056907]
 We present Self-Steering Optimization ($SSO$), an algorithm that autonomously generates high-quality preference data.<n>$SSO$ employs a specialized optimization objective to build a data generator from the policy model itself, which is used to produce accurate and on-policy data.<n>Our evaluation shows that $SSO$ consistently outperforms baselines in human preference alignment and reward optimization.
 arXiv  Detail & Related papers  (2024-10-22T16:04:03Z)
- On The Planning Abilities of OpenAI's o1 Models: Feasibility,   Optimality, and Generalizability [59.72892401927283]
 We evaluate the planning capabilities of OpenAI's o1 models across a variety of benchmark tasks.
Our results reveal that o1-preview outperforms GPT-4 in adhering to task constraints.
 arXiv  Detail & Related papers  (2024-09-30T03:58:43Z)
- Cost-aware Bayesian Optimization via the Pandora's Box Gittins Index [57.045952766988925]
 We develop a previously-unexplored connection between cost-aware Bayesian optimization and the Pandora's Box problem, a decision problem from economics.
Our work constitutes a first step towards integrating techniques from Gittins index theory into Bayesian optimization.
 arXiv  Detail & Related papers  (2024-06-28T17:20:13Z)
- PRICE: A Pretrained Model for Cross-Database Cardinality Estimation [78.30959470441442]
 Cardinality estimation (CardEst) is essential for optimizing query execution plans.
Recent ML-based CardEst methods achieve high accuracy but face deployment challenges due to high preparation costs.
We propose PRICE, a PRetrained multI-table CardEst model, which addresses these limitations.
 arXiv  Detail & Related papers  (2024-06-03T06:21:53Z)
- Budget-aware Query Tuning: An AutoML Perspective [14.561951257365953]
 Modern database systems rely on cost-based querys to come up with good execution plans for input queries.
We show that by varying the costunit values one can obtain query plans that significantly outperform the default query plans.
 arXiv  Detail & Related papers  (2024-03-29T20:19:36Z)
- Roq: Robust Query Optimization Based on a Risk-aware Learned Cost Model [3.0784574277021406]
 We propose a holistic framework that enables robust query optimization based on a risk-aware learning approach.
Roq includes a novel formalization of the notion of robustness in the context of query optimization.
We demonstrate experimentally that Roq provides significant improvements to robust query optimization compared to the state-of-the-art.
 arXiv  Detail & Related papers  (2024-01-26T21:16:37Z)
- JoinGym: An Efficient Query Optimization Environment for Reinforcement
  Learning [58.71541261221863]
 Join order selection (JOS) is the problem of ordering join operations to minimize total query execution cost.
We present JoinGym, a query optimization environment for bushy reinforcement learning (RL)
Under the hood, JoinGym simulates a query plan's cost by looking up intermediate result cardinalities from a pre-computed dataset.
 arXiv  Detail & Related papers  (2023-07-21T17:00:06Z)
- Bayesian Optimization Over Iterative Learners with Structured Responses:
  A Budget-aware Planning Approach [31.918476422203412]
 This paper proposes a novel approach referred to as Budget-Aware Planning for Iterative learners (BAPI) to solve HPO problems under a constrained cost budget.
 Experiments on diverse HPO benchmarks for iterative learners show that BAPI performs better than state-of-the-art baselines in most of the cases.
 arXiv  Detail & Related papers  (2022-06-25T18:44:06Z)
- Cost Optimal Planning as Satisfiability [5.482532589225552]
 We use upper bounds on the length of cost optimal plans as horizons for a SAT-based encoding of planning with costs.
We experimentally show that this SAT-based approach is able to compute plans with better costs, and in many cases it can match the optimal cost.
 arXiv  Detail & Related papers  (2021-03-03T12:18:18Z)
- On Exploiting Hitting Sets for Model Reconciliation [53.81101846598925]
 In human-aware planning, a planning agent may need to provide an explanation to a human user on why its plan is optimal.
A popular approach to do this is called model reconciliation, where the agent tries to reconcile the differences in its model and the human's model.
We present a logic-based framework for model reconciliation that extends beyond the realm of planning.
 arXiv  Detail & Related papers  (2020-12-16T21:25:53Z)
- Optimal Bayesian experimental design for subsurface flow problems [77.34726150561087]
 We propose a novel approach for development of chaos expansion (PCE) surrogate model for the design utility function.
This novel technique enables the derivation of a reasonable quality response surface for the targeted objective function with a computational budget comparable to several single-point evaluations.
 arXiv  Detail & Related papers  (2020-08-10T09:42:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
       
     
           This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.