Related papers: Roq: Robust Query Optimization Based on a Risk-aware Learned Cost Model

URL: http://arxiv.org/abs/2401.15210v1
Date: Fri, 26 Jan 2024 21:16:37 GMT
Title: Roq: Robust Query Optimization Based on a Risk-aware Learned Cost Model
Authors: Amin Kamali, Verena Kantere, Calisto Zuzarte, and Vincent Corvinelli
Abstract summary: We propose a holistic framework that enables robust query optimization based on a risk-aware learning approach. Roq includes a novel formalization of the notion of robustness in the context of query optimization. We demonstrate experimentally that Roq provides significant improvements to robust query optimization compared to the state-of-the-art.
Score: 3.0784574277021406
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Query optimizers in relational database management systems (RDBMSs) search for execution plans expected to be optimal for given queries. They use parameter estimates, often inaccurate, and make assumptions that may not hold in practice. Consequently, they may select execution plans that are suboptimal at runtime, when these estimates and assumptions are not valid, which may result in poor query performance. Therefore, query optimizers do not sufficiently support robust query optimization. Recent years have seen a surge of interest in using machine learning (ML) to improve the efficiency of data systems and reduce their maintenance overheads, with promising results obtained in the area of query optimization in particular. In this paper, inspired by these advancements, and based on several years of experience of IBM Db2 in this journey, we propose Robust Optimization of Queries (Roq), a holistic framework that enables robust query optimization based on a risk-aware learning approach. Roq includes a novel formalization of the notion of robustness in the context of query optimization and a principled approach for its quantification and measurement based on approximate probabilistic ML. It also includes novel strategies and algorithms for query plan evaluation and selection. Roq also includes a novel learned cost model that is designed to predict query execution cost and the associated risks and performs query optimization accordingly. We demonstrate experimentally that Roq provides significant improvements to robust query optimization compared to the state-of-the-art.
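
The abstract describes selecting plans from a predicted execution cost together with an associated risk. A minimal sketch of that general idea follows. It is not Roq's actual algorithm: the `PlanEstimate` class, the `select_plan` function, the `risk_weight` parameter, the plan names, and all numbers are hypothetical, and the mean-plus-weighted-standard-deviation score is just one simple risk-aware criterion chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanEstimate:
    """Hypothetical container: a plan's predicted cost and its uncertainty."""
    name: str
    expected_cost: float  # predicted execution cost (arbitrary units)
    cost_std: float       # predicted standard deviation, used as a risk proxy


def select_plan(plans, risk_weight=1.0):
    """Pick the plan minimizing expected_cost + risk_weight * cost_std.

    risk_weight = 0 ignores uncertainty (plain cost-based choice);
    larger values penalize plans whose cost prediction is less certain.
    """
    if not plans:
        raise ValueError("no candidate plans")
    return min(plans, key=lambda p: p.expected_cost + risk_weight * p.cost_std)


# Made-up candidates: one cheap but uncertain plan, one slightly
# more expensive plan with a tight cost prediction.
candidates = [
    PlanEstimate("hash_join_first", expected_cost=100.0, cost_std=60.0),
    PlanEstimate("index_nested_loop", expected_cost=120.0, cost_std=5.0),
]

risk_neutral = select_plan(candidates, risk_weight=0.0)
risk_averse = select_plan(candidates, risk_weight=1.0)
print(risk_neutral.name, risk_averse.name)
```

With these made-up numbers, the risk-neutral choice is the plan with the lowest expected cost, while the risk-averse choice prefers the plan whose prediction is more certain.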

Related papers

Reqo: A Robust and Explainable Query Optimization Cost Model [2.184775414778289]
We propose a tree model architecture based on Bidirectional Graph Neural Networks (Bi-GNN) aggregated by Gated Recurrent Units (GRUs). We implement a novel learning-to-rank cost model that effectively quantifies the uncertainty in cost estimates using approximate probabilistic ML. In addition, we propose the first explainability technique specifically designed for learning-based cost models.
arXiv Detail & Related papers (2025-01-29T04:48:51Z)
HERO: Hint-Based Efficient and Reliable Query Optimizer [0.0]
We propose a novel model for learned query optimization which provides query hints leading to better execution plans. The model addresses the three key challenges in learned hint-based query optimization: reliable hint recommendation, efficient hint exploration, and fast inference. Our model is interpretable and easy to debug, which is particularly important for deployment in production.
arXiv Detail & Related papers (2024-12-03T10:58:34Z)
Query Optimization for Parametric Knowledge Refinement in Retrieval-Augmented Large Language Models [26.353428245346166]
The Extract-Refine-Retrieve-Read (ERRR) framework is designed to bridge the pre-retrieval information gap in Retrieval-Augmented Generation (RAG) systems. Unlike conventional query optimization techniques used in RAG, the ERRR framework begins by extracting knowledge from Large Language Models (LLMs).
arXiv Detail & Related papers (2024-11-12T14:12:45Z)
Revisiting BPR: A Replicability Study of a Common Recommender System Baseline [78.00363373925758]
We study the features of the BPR model, indicating their impact on its performance, and investigate open-source BPR implementations. Our analysis reveals inconsistencies between these implementations and the original BPR paper, leading to a significant decrease in performance of up to 50% for specific implementations. We show that the BPR model can achieve performance levels close to state-of-the-art methods on the top-n recommendation tasks and even outperform them on specific datasets.
arXiv Detail & Related papers (2024-09-21T18:39:53Z)
Hydro: Adaptive Query Processing of ML Queries [7.317548344184541]
We present Hydro, an adaptive query processing (AQP) system for efficiently processing machine learning (ML) queries. We demonstrate Hydro's efficacy through four illustrative use cases, delivering up to 11.52x speedup over a baseline system.
arXiv Detail & Related papers (2024-03-22T01:17:07Z)
End-to-End Learning for Fair Multiobjective Optimization Under Uncertainty [55.04219793298687]
The Predict-Then-Optimize (PtO) paradigm in machine learning aims to maximize downstream decision quality. This paper extends the PtO methodology to optimization problems with nondifferentiable Ordered Weighted Averaging (OWA) objectives. It shows how optimization of OWA functions can be effectively integrated with parametric prediction for fair and robust optimization under uncertainty.
arXiv Detail & Related papers (2024-02-12T16:33:35Z)
EASRec: Elastic Architecture Search for Efficient Long-term Sequential Recommender Systems [82.76483989905961]
Current Sequential Recommender Systems (SRSs) suffer from computational and resource inefficiencies. We develop the Elastic Architecture Search for Efficient Long-term Sequential Recommender Systems (EASRec). EASRec introduces data-aware gates that leverage historical information from the input data batch to improve the performance of the recommendation network.
arXiv Detail & Related papers (2024-02-01T07:22:52Z)
Learning Regions of Interest for Bayesian Optimization with Adaptive Level-Set Estimation [84.0621253654014]
We propose a framework, called BALLET, which adaptively filters for a high-confidence region of interest. We show theoretically that BALLET can efficiently shrink the search space, and can exhibit a tighter regret bound than standard BO.
arXiv Detail & Related papers (2023-07-25T09:45:47Z)
JoinGym: An Efficient Query Optimization Environment for Reinforcement Learning [58.71541261221863]
Join order selection (JOS) is the problem of ordering join operations to minimize total query execution cost. We present JoinGym, a query optimization environment for reinforcement learning (RL). Under the hood, JoinGym simulates a query plan's cost by looking up intermediate result cardinalities from a pre-computed dataset.
arXiv Detail & Related papers (2023-07-21T17:00:06Z)
Kepler: Robust Learning for Faster Parametric Query Optimization [5.6119420695093245]
We propose an end-to-end learning-based approach to parametric query optimization. Kepler achieves significant improvements in query runtime on multiple datasets.
arXiv Detail & Related papers (2023-06-11T22:39:28Z)
BitE: Accelerating Learned Query Optimization in a Mixed-Workload Environment [0.36700088931938835]
BitE is a novel ensemble learning model using database statistics and metadata to tune a learned query optimizer for enhancing performance. Our model achieves 19.6% more improved queries and 15.8% fewer regressed queries than existing traditional methods.
arXiv Detail & Related papers (2023-06-01T16:05:33Z)
Best Arm Identification for Stochastic Rising Bandits [84.55453174601826]
Stochastic Rising Bandits (SRBs) model sequential decision-making problems in which the expected reward of the available options increases every time they are selected. This paper focuses on the fixed-budget Best Arm Identification (BAI) problem for SRBs. We propose two algorithms to tackle the above-mentioned setting, namely R-UCBE and R-SR.
arXiv Detail & Related papers (2023-02-15T08:01:37Z)
Lero: A Learning-to-Rank Query Optimizer [49.841082217997354]
We introduce a learning-to-rank query optimizer, called Lero, which builds on top of the native query optimizer and continuously learns to improve query optimization. Rather than building a learned optimizer from scratch, Lero is designed to leverage decades of wisdom of databases and improve the native optimizer. Lero achieves near-optimal performance on several benchmarks.
arXiv Detail & Related papers (2023-02-14T07:31:11Z)
Generalizing Bayesian Optimization with Decision-theoretic Entropies [102.82152945324381]
We consider a generalization of Shannon entropy from work in statistical decision theory. We first show that special cases of this entropy lead to popular acquisition functions used in BO procedures. We then show how alternative choices for the loss yield a flexible family of acquisition functions.
arXiv Detail & Related papers (2022-10-04T04:43:58Z)
A Survey on Advancing the DBMS Query Optimizer: Cardinality Estimation, Cost Model, and Plan Enumeration [17.75042918159419]
A cost-based optimizer is adopted in almost all current database systems. In the cost model, cardinality, the number of tuples passing through an operator, plays a crucial role. Due to the inaccuracy in cardinality estimation, errors in the cost model, and the huge plan space, the optimizer cannot find the optimal execution plan for a complex query in a reasonable time.
arXiv Detail & Related papers (2021-01-05T13:47:45Z)
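
The JoinGym entry in the list above mentions simulating a plan's cost by looking up intermediate result cardinalities from a pre-computed dataset. The sketch below illustrates that lookup idea for left-deep join orders only. It is not JoinGym's API: the `plan_cost` function, the table names, and the cardinality values are hypothetical, and summing intermediate result sizes is assumed here as the cost proxy.

```python
from itertools import permutations


def plan_cost(join_order, cardinalities):
    """Cost of a left-deep join order as the sum of intermediate result sizes.

    cardinalities maps a frozenset of joined table names to the
    (pre-computed) number of rows produced by joining exactly those tables.
    """
    joined = frozenset()
    total = 0
    for position, table in enumerate(join_order):
        joined = joined | {table}
        if position >= 1:  # the first table alone is not an intermediate join
            total += cardinalities[joined]
    return total


# Made-up cardinalities for every joinable subset of three tables.
cards = {
    frozenset({"a", "b"}): 1000,
    frozenset({"b", "c"}): 10,
    frozenset({"a", "c"}): 500,
    frozenset({"a", "b", "c"}): 50,
}

# Exhaustive search is only feasible for tiny queries; it stands in
# for whatever search or learned policy picks the join order.
best_order = min(permutations(["a", "b", "c"]), key=lambda o: plan_cost(o, cards))
print(best_order, plan_cost(best_order, cards))
```

Because every cost is a dictionary lookup rather than a query execution, many candidate join orders can be scored cheaply, which is the property the summary attributes to a pre-computed dataset.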

This list is automatically generated from the titles and abstracts of the papers on this site.