Roq: Robust Query Optimization Based on a Risk-aware Learned Cost Model
- URL: http://arxiv.org/abs/2401.15210v1
- Date: Fri, 26 Jan 2024 21:16:37 GMT
- Title: Roq: Robust Query Optimization Based on a Risk-aware Learned Cost Model
- Authors: Amin Kamali, Verena Kantere, Calisto Zuzarte, and Vincent Corvinelli
- Abstract summary: We propose a holistic framework that enables robust query optimization based on a risk-aware learning approach.
Roq includes a novel formalization of the notion of robustness in the context of query optimization.
We demonstrate experimentally that Roq provides significant improvements to robust query optimization compared to the state-of-the-art.
- Score: 3.0784574277021406
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Query optimizers in relational database management systems (RDBMSs) search
for execution plans expected to be optimal for given queries. They use
parameter estimates, often inaccurate, and make assumptions that may not hold
in practice. Consequently, they may select execution plans that are suboptimal
at runtime, when these estimates and assumptions are not valid, which may
result in poor query performance. Therefore, query optimizers do not
sufficiently support robust query optimization. Recent years have seen a surge
of interest in using machine learning (ML) to improve efficiency of data
systems and reduce their maintenance overheads, with promising results obtained
in the area of query optimization in particular. In this paper, inspired by
these advancements, and based on several years of experience of IBM Db2 in this
journey, we propose Robust Optimization of Queries (Roq), a holistic framework
that enables robust query optimization based on a risk-aware learning approach.
Roq includes a novel formalization of the notion of robustness in the context
of query optimization and a principled approach for its quantification and
measurement based on approximate probabilistic ML. It also includes novel
strategies and algorithms for query plan evaluation and selection. Roq also
includes a novel learned cost model that is designed to predict query execution
cost and the associated risks and performs query optimization accordingly. We
demonstrate experimentally that Roq provides significant improvements to robust
query optimization compared to the state-of-the-art.
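The abstract describes a cost model that predicts both execution cost and the associated risk, with plan selection taking both into account. The abstract does not give Roq's actual selection rule, so the following is only a minimal sketch of the general idea, assuming a simple mean-plus-weighted-uncertainty score. The names `PlanEstimate`, `risk_adjusted_cost`, `select_plan`, and `risk_weight`, as well as the candidate plans and their numbers, are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PlanEstimate:
    """Hypothetical output of a risk-aware cost model for one candidate plan."""
    plan_id: str
    expected_cost: float  # predicted execution cost (arbitrary units)
    cost_std: float       # predicted uncertainty of that cost


def risk_adjusted_cost(est: PlanEstimate, risk_weight: float) -> float:
    # Assumed scoring rule (not taken from the paper): penalize uncertain plans
    # by adding a weighted standard deviation to the expected cost.
    return est.expected_cost + risk_weight * est.cost_std


def select_plan(estimates: Sequence[PlanEstimate], risk_weight: float = 1.0) -> PlanEstimate:
    """Return the candidate with the lowest risk-adjusted cost."""
    if not estimates:
        raise ValueError("at least one candidate plan is required")
    return min(estimates, key=lambda e: risk_adjusted_cost(e, risk_weight))


# Illustrative, made-up candidates: one cheap but uncertain, one slightly
# more expensive but stable.
candidates = [
    PlanEstimate("hash_join_first", expected_cost=100.0, cost_std=60.0),
    PlanEstimate("nested_loop_first", expected_cost=110.0, cost_std=5.0),
]

risk_neutral = select_plan(candidates, risk_weight=0.0)
risk_averse = select_plan(candidates, risk_weight=1.0)
print(risk_neutral.plan_id, risk_averse.plan_id)
```

With `risk_weight=0.0` the choice reduces to classic lowest-expected-cost selection. A positive weight trades some expected cost for lower predicted variance, which is one common way to express a preference for robust plans.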
Related papers
- Query Optimization for Parametric Knowledge Refinement in Retrieval-Augmented Large Language Models [26.353428245346166]
The Extract-Refine-Retrieve-Read (ERRR) framework is designed to bridge the pre-retrieval information gap in Retrieval-Augmented Generation (RAG) systems.
Unlike conventional query optimization techniques used in RAG, the ERRR framework begins by extracting knowledge from Large Language Models (LLMs).
arXiv Detail & Related papers (2024-11-12T14:12:45Z)
- Revisiting BPR: A Replicability Study of a Common Recommender System Baseline [78.00363373925758]
We study the features of the BPR model, indicating their impact on its performance, and investigate open-source BPR implementations.
Our analysis reveals inconsistencies between these implementations and the original BPR paper, leading to a significant decrease in performance of up to 50% for specific implementations.
We show that the BPR model can achieve performance levels close to state-of-the-art methods on the top-n recommendation tasks and even outperform them on specific datasets.
arXiv Detail & Related papers (2024-09-21T18:39:53Z)
- Hydro: Adaptive Query Processing of ML Queries [7.317548344184541]
We present Hydro, a system that uses adaptive query processing (AQP) to efficiently process machine learning (ML) queries.
We demonstrate Hydro's efficacy through four illustrative use cases, delivering up to 11.52x speedup over a baseline system.
arXiv Detail & Related papers (2024-03-22T01:17:07Z)
- End-to-End Learning for Fair Multiobjective Optimization Under Uncertainty [55.04219793298687]
The Predict-Then-Optimize (PtO) paradigm in machine learning aims to maximize downstream decision quality.
This paper extends the PtO methodology to optimization problems with nondifferentiable Ordered Weighted Averaging (OWA) objectives.
It shows how optimization of OWA functions can be effectively integrated with parametric prediction for fair and robust optimization under uncertainty.
arXiv Detail & Related papers (2024-02-12T16:33:35Z)
- EASRec: Elastic Architecture Search for Efficient Long-term Sequential Recommender Systems [82.76483989905961]
Current Sequential Recommender Systems (SRSs) suffer from computational and resource inefficiencies.
We develop the Elastic Architecture Search for Efficient Long-term Sequential Recommender Systems (EASRec).
EASRec introduces data-aware gates that leverage historical information from input data batch to improve the performance of the recommendation network.
arXiv Detail & Related papers (2024-02-01T07:22:52Z)
- JoinGym: An Efficient Query Optimization Environment for Reinforcement Learning [58.71541261221863]
Join order selection (JOS) is the problem of ordering join operations to minimize total query execution cost.
We present JoinGym, a query optimization environment for reinforcement learning (RL) with support for bushy join plans.
Under the hood, JoinGym simulates a query plan's cost by looking up intermediate result cardinalities from a pre-computed dataset.
arXiv Detail & Related papers (2023-07-21T17:00:06Z)
- Kepler: Robust Learning for Faster Parametric Query Optimization [5.6119420695093245]
We propose an end-to-end learning-based approach to parametric query optimization.
Kepler achieves significant improvements in query runtime on multiple datasets.
arXiv Detail & Related papers (2023-06-11T22:39:28Z)
- BitE : Accelerating Learned Query Optimization in a Mixed-Workload Environment [0.36700088931938835]
BitE is a novel ensemble learning model using database statistics and metadata to tune a learned query optimizer for enhancing performance.
Our model achieves 19.6% more improved queries and 15.8% fewer regressed queries compared to the existing traditional methods.
arXiv Detail & Related papers (2023-06-01T16:05:33Z)
- Lero: A Learning-to-Rank Query Optimizer [49.841082217997354]
We introduce a learning-to-rank query optimizer, called Lero, which builds on top of the native query optimizer and continuously learns to improve query optimization.
Rather than building a learned optimizer from scratch, Lero is designed to leverage decades of wisdom of databases and improve the native optimizer.
Lero achieves near optimal performance on several benchmarks.
arXiv Detail & Related papers (2023-02-14T07:31:11Z)
- Generalizing Bayesian Optimization with Decision-theoretic Entropies [102.82152945324381]
We consider a generalization of Shannon entropy from work in statistical decision theory.
We first show that special cases of this entropy lead to popular acquisition functions used in BO procedures.
We then show how alternative choices for the loss yield a flexible family of acquisition functions.
arXiv Detail & Related papers (2022-10-04T04:43:58Z)
- A Survey on Advancing the DBMS Query Optimizer: Cardinality Estimation, Cost Model, and Plan Enumeration [17.75042918159419]
A cost-based algorithm is adopted in almost all current database systems.
In the cost model, cardinality, the number of tuples passing through an operator, plays a crucial role.
Due to the inaccuracy in cardinality estimation, errors in the cost model, and the huge plan space, the algorithm cannot find the optimal execution plan for a complex query in a reasonable time.
arXiv Detail & Related papers (2021-01-05T13:47:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.