Related papers: Lero: A Learning-to-Rank Query Optimizer

URL: http://arxiv.org/abs/2302.06873v1
Date: Tue, 14 Feb 2023 07:31:11 GMT
Title: Lero: A Learning-to-Rank Query Optimizer
Authors: Rong Zhu, Wei Chen, Bolin Ding, Xingguang Chen, Andreas Pfadler, Ziniu Wu, Jingren Zhou
Abstract summary: We introduce a learning-to-rank query optimizer, called Lero, which builds on top of the native query optimizer and continuously learns to improve query optimization. Rather than building a learned optimizer from scratch, Lero is designed to leverage decades of wisdom of databases and improve the native optimizer. Lero achieves near-optimal performance on several benchmarks.
Score: 49.841082217997354
License: http://creativecommons.org/licenses/by/4.0/
Abstract: A recent line of work applies machine learning techniques to assist or rebuild cost-based query optimizers in DBMS. While exhibiting superiority in some benchmarks, their deficiencies, e.g., unstable performance, high training cost, and slow model updating, stem from the inherent hardness of predicting the cost or latency of execution plans using machine learning models. In this paper, we introduce a learning-to-rank query optimizer, called Lero, which builds on top of the native query optimizer and continuously learns to improve query optimization. The key observation is that the relative order or rank of plans, rather than the exact cost or latency, is sufficient for query optimization. Lero employs a pairwise approach to train a classifier to compare any two plans and tell which one is better. Such a binary classification task is much easier than the regression task of predicting the cost or latency, in terms of model efficiency and effectiveness. Rather than building a learned optimizer from scratch, Lero is designed to leverage decades of wisdom of databases and improve the native optimizer. With its non-intrusive design, Lero can be implemented on top of any existing DBMS with minimal integration effort. We implement Lero and demonstrate its outstanding performance using PostgreSQL. In our experiments, Lero achieves near-optimal performance on several benchmarks. It reduces the execution time of the native PostgreSQL optimizer by up to 70% and that of other learned query optimizers by up to 37%. Meanwhile, Lero continuously learns and automatically adapts to query workloads and changes in data.
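
Illustrative sketch (not from the paper): the pairwise idea in the abstract can be shown with a toy comparator. The abstract does not describe Lero's model or plan encoding, so everything here is an assumption made for illustration: plans are represented as small numeric feature vectors, the comparator is a plain logistic model over feature differences, and the names train_pairwise_comparator, prefers_first, and pick_best_plan are hypothetical. The point is only that training needs the relative order of observed latencies, not their exact values.

```python
import math
from itertools import permutations


def train_pairwise_comparator(plan_features, latencies, epochs=300, lr=0.5):
    """Fit weights w so that sigmoid(w . (x_i - x_j)) estimates
    P(plan i is faster than plan j). Assumed toy model, not Lero's."""
    pairs = []
    for i, j in permutations(range(len(plan_features)), 2):
        if latencies[i] == latencies[j]:
            continue  # ties carry no ordering signal
        diff = [a - b for a, b in zip(plan_features[i], plan_features[j])]
        # Only the order of the two latencies is used as the label.
        label = 1.0 if latencies[i] < latencies[j] else 0.0
        pairs.append((diff, label))
    if not pairs:
        raise ValueError("need at least two plans with different latencies")

    w = [0.0] * len(plan_features[0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for diff, label in pairs:
            z = sum(wk * dk for wk, dk in zip(w, diff))
            p = 1.0 / (1.0 + math.exp(-z))
            for k, dk in enumerate(diff):
                grad[k] += (p - label) * dk  # gradient of the log loss
        w = [wk - lr * gk / len(pairs) for wk, gk in zip(w, grad)]
    return w


def prefers_first(w, plan_a, plan_b):
    """True if the comparator predicts plan_a to be faster than plan_b."""
    return sum(wk * (a - b) for wk, a, b in zip(w, plan_a, plan_b)) > 0


def pick_best_plan(w, candidates):
    """Return the index of the candidate that wins the pairwise comparisons."""
    best = 0
    for idx in range(1, len(candidates)):
        if prefers_first(w, candidates[idx], candidates[best]):
            best = idx
    return best


# Toy usage: made-up features and latencies, where larger features mean slower plans.
features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
observed = [0.0, 2.0, 1.0, 3.0, 1.5]
weights = train_pairwise_comparator(features, observed)
chosen = pick_best_plan(weights, [[1.0, 1.0], [5.0, 5.0], [0.1, 0.2]])
```

In this sketch the comparator only ever sees which plan of a pair ran faster, which is the classification framing the abstract contrasts with latency regression. A real system would generate the candidate plans with the native optimizer and use a far richer plan representation.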

Related papers

Large Scale Multi-Task Bayesian Optimization with Large Language Models [29.12351845364205]
We introduce a novel approach leveraging large language models (LLMs) to learn from, and improve upon, previous optimization trajectories. We evaluate our method on two distinct domains: database query optimization and antimicrobial peptide design.
arXiv Detail & Related papers (2025-03-11T07:46:19Z)
GenJoin: Conditional Generative Plan-to-Plan Query Optimizer that Learns from Subplan Hints [1.3108652488669732]
We present GenJoin, a novel learned query optimizer that considers the query optimization problem as a symbiotic generative task. GenJoin is the first learned query optimizer that significantly and consistently outperforms state-of-the-art methods on two well-known real-world benchmarks.
arXiv Detail & Related papers (2024-11-07T08:31:01Z)
The Unreasonable Effectiveness of LLMs for Query Optimization [4.50924404547119]
We show that embeddings of query text contain useful semantic information for query optimization. We show that a simple binary classifier deciding between alternative query plans, trained on a small number of embedded query vectors, can outperform existing systems.
arXiv Detail & Related papers (2024-11-05T07:10:00Z)
No more optimization rules: LLM-enabled policy-based multi-modal query optimizer [9.370719876854228]
Large language models (LLMs) have marked a pivotal moment in the field of machine learning and deep learning. In this paper, we investigate the query optimization ability of LLMs and use an LLM to design LaPuda, a novel LLM and policy based multi-modal query optimizer.
arXiv Detail & Related papers (2024-03-20T13:44:30Z)
JoinGym: An Efficient Query Optimization Environment for Reinforcement Learning [58.71541261221863]
Join order selection (JOS) is the problem of ordering join operations to minimize total query execution cost. We present JoinGym, a query optimization environment for reinforcement learning (RL) that covers bushy join plans. Under the hood, JoinGym simulates a query plan's cost by looking up intermediate result cardinalities from a pre-computed dataset.
arXiv Detail & Related papers (2023-07-21T17:00:06Z)
Kepler: Robust Learning for Faster Parametric Query Optimization [5.6119420695093245]
We propose an end-to-end learning-based approach to parametric query optimization. Kepler achieves significant improvements in query runtime on multiple datasets.
arXiv Detail & Related papers (2023-06-11T22:39:28Z)
VeLO: Training Versatile Learned Optimizers by Scaling Up [67.90237498659397]
We leverage the same scaling approach behind the success of deep learning to learn versatile optimizers. We train an optimizer for deep learning which is itself a small neural network that ingests gradients and outputs parameter updates. We open source our learned optimizer, meta-training code, the associated train and test data, and an extensive benchmark suite with baselines at velo-code.io.
arXiv Detail & Related papers (2022-11-17T18:39:07Z)
Learning to Optimize: A Primer and A Benchmark [94.29436694770953]
Learning to optimize (L2O) is an emerging approach that leverages machine learning to develop optimization methods. This article is poised to be the first comprehensive survey and benchmark of L2O for continuous optimization.
arXiv Detail & Related papers (2021-03-23T20:46:20Z)
Reverse engineering learned optimizers reveals known and novel mechanisms [50.50540910474342]
Learned optimizers are algorithms that can themselves be trained to solve optimization problems. Our results help elucidate the previously murky understanding of how learned optimizers work, and establish tools for interpreting future learned optimizers.
arXiv Detail & Related papers (2020-11-04T07:12:43Z)
Tasks, stability, architecture, and compute: Training more effective learned optimizers, and using them to train themselves [53.37905268850274]
We introduce a new, neural network parameterized, hierarchical optimizer with access to additional features such as validation loss to enable automatic regularization. Most learned optimizers have been trained on only a single task, or a small number of tasks. We train our optimizers on thousands of tasks, making use of orders of magnitude more compute, resulting in optimizers that generalize better to unseen tasks.
arXiv Detail & Related papers (2020-09-23T16:35:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.
