Related papers: A Survey on Advancing the DBMS Query Optimizer: Cardinality Estimation, Cost Model, and Plan Enumeration

A Survey on Advancing the DBMS Query Optimizer: Cardinality Estimation, Cost Model, and Plan Enumeration

URL: http://arxiv.org/abs/2101.01507v1
Date: Tue, 5 Jan 2021 13:47:45 GMT
Title: A Survey on Advancing the DBMS Query Optimizer: Cardinality Estimation, Cost Model, and Plan Enumeration
Authors: Hai Lan, Zhifeng Bao, Yuwei Peng
Abstract summary: A cost-based algorithm is adopted in almost all current database systems. In the cost model, cardinality, the number of the numbers through an operator plays a crucial role. Due to the inaccuracy in cardinality estimation, errors in cost, and the huge plan space model, the algorithm cannot find the optimal execution plan for a complex query in a reasonable time.
Score: 17.75042918159419
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Query optimizer is at the heart of the database systems. Cost-based optimizer studied in this paper is adopted in almost all current database systems. A cost-based optimizer introduces a plan enumeration algorithm to find a (sub)plan, and then uses a cost model to obtain the cost of that plan, and selects the plan with the lowest cost. In the cost model, cardinality, the number of tuples through an operator, plays a crucial role. Due to the inaccuracy in cardinality estimation, errors in cost model, and the huge plan space, the optimizer cannot find the optimal execution plan for a complex query in a reasonable time. In this paper, we first deeply study the causes behind the limitations above. Next, we review the techniques used to improve the quality of the three key components in the cost-based optimizer, cardinality estimation, cost model, and plan enumeration. We also provide our insights on the future directions for each of the above aspects.

Related papers

Reqo: A Robust and Explainable Query Optimization Cost Model [2.184775414778289]
We propose a tree model architecture based on Bidirectional Graph Neural Networks (Bi-GNN) aggregated by Gated Recurrent Units (GRUs) We implement a novel learning-to-rank cost model that effectively quantifies the uncertainty in cost estimates using approximate probabilistic ML. In addition, we propose the first explainability technique specifically designed for learning-based cost models.
arXiv Detail & Related papers (2025-01-29T04:48:51Z)
On The Planning Abilities of OpenAI's o1 Models: Feasibility, Optimality, and Generalizability [59.72892401927283]
We evaluate the planning capabilities of OpenAI's o1 models across a variety of benchmark tasks. Our results reveal that o1-preview outperforms GPT-4 in adhering to task constraints.
arXiv Detail & Related papers (2024-09-30T03:58:43Z)
Cost-aware Bayesian Optimization via the Pandora's Box Gittins Index [57.045952766988925]
We develop a previously-unexplored connection between cost-aware Bayesian optimization and the Pandora's Box problem, a decision problem from economics. Our work constitutes a first step towards integrating techniques from Gittins index theory into Bayesian optimization.
arXiv Detail & Related papers (2024-06-28T17:20:13Z)
PRICE: A Pretrained Model for Cross-Database Cardinality Estimation [78.30959470441442]
Cardinality estimation (CardEst) is essential for optimizing query execution plans. Recent ML-based CardEst methods achieve high accuracy but face deployment challenges due to high preparation costs. We propose PRICE, a PRetrained multI-table CardEst model, which addresses these limitations.
arXiv Detail & Related papers (2024-06-03T06:21:53Z)
Budget-aware Query Tuning: An AutoML Perspective [14.561951257365953]
Modern database systems rely on cost-based querys to come up with good execution plans for input queries. We show that by varying the costunit values one can obtain query plans that significantly outperform the default query plans.
arXiv Detail & Related papers (2024-03-29T20:19:36Z)
Roq: Robust Query Optimization Based on a Risk-aware Learned Cost Model [3.0784574277021406]
We propose a holistic framework that enables robust query optimization based on a risk-aware learning approach. Roq includes a novel formalization of the notion of robustness in the context of query optimization. We demonstrate experimentally that Roq provides significant improvements to robust query optimization compared to the state-of-the-art.
arXiv Detail & Related papers (2024-01-26T21:16:37Z)
JoinGym: An Efficient Query Optimization Environment for Reinforcement Learning [58.71541261221863]
Join order selection (JOS) is the problem of ordering join operations to minimize total query execution cost. We present JoinGym, a query optimization environment for bushy reinforcement learning (RL) Under the hood, JoinGym simulates a query plan's cost by looking up intermediate result cardinalities from a pre-computed dataset.
arXiv Detail & Related papers (2023-07-21T17:00:06Z)
Bayesian Optimization Over Iterative Learners with Structured Responses: A Budget-aware Planning Approach [31.918476422203412]
This paper proposes a novel approach referred to as Budget-Aware Planning for Iterative learners (BAPI) to solve HPO problems under a constrained cost budget. Experiments on diverse HPO benchmarks for iterative learners show that BAPI performs better than state-of-the-art baselines in most of the cases.
arXiv Detail & Related papers (2022-06-25T18:44:06Z)
Cost Optimal Planning as Satisfiability [5.482532589225552]
We use upper bounds on the length of cost optimal plans as horizons for a SAT-based encoding of planning with costs. We experimentally show that this SAT-based approach is able to compute plans with better costs, and in many cases it can match the optimal cost.
arXiv Detail & Related papers (2021-03-03T12:18:18Z)
On Exploiting Hitting Sets for Model Reconciliation [53.81101846598925]
In human-aware planning, a planning agent may need to provide an explanation to a human user on why its plan is optimal. A popular approach to do this is called model reconciliation, where the agent tries to reconcile the differences in its model and the human's model. We present a logic-based framework for model reconciliation that extends beyond the realm of planning.
arXiv Detail & Related papers (2020-12-16T21:25:53Z)
Optimal Bayesian experimental design for subsurface flow problems [77.34726150561087]
We propose a novel approach for development of chaos expansion (PCE) surrogate model for the design utility function. This novel technique enables the derivation of a reasonable quality response surface for the targeted objective function with a computational budget comparable to several single-point evaluations.
arXiv Detail & Related papers (2020-08-10T09:42:59Z)

This list is automatically generated from the titles and abstracts of the papers in this site.