FOSS: A Self-Learned Doctor for Query Optimizer
- URL: http://arxiv.org/abs/2312.06357v2
- Date: Wed, 14 Aug 2024 03:27:05 GMT
- Title: FOSS: A Self-Learned Doctor for Query Optimizer
- Authors: Kai Zhong, Luming Sun, Tao Ji, Cuiping Li, Hong Chen,
- Abstract summary: We introduce a novel framework for query optimization based on deep reinforcement learning, called FOSS.
Foss initiates optimization from the original plan generated by a traditional bootstrap and incrementally refines suboptimal nodes of the plan.
We evaluate the performance of FOSS on Join Order Benchmark, TPC-DS, and Stack Overflow benchmarks.
- Score: 19.15262900354963
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Various works have utilized deep learning to address the query optimization problem in database system. They either learn to construct plans from scratch in a bottom-up manner or steer the plan generation behavior of traditional optimizer using hints. While these methods have achieved some success, they face challenges in either low training efficiency or limited plan search space. To address these challenges, we introduce FOSS, a novel framework for query optimization based on deep reinforcement learning. FOSS initiates optimization from the original plan generated by a traditional optimizer and incrementally refines suboptimal nodes of the plan through a sequence of actions. Additionally, we devise an asymmetric advantage model to evaluate the advantage between two plans. We integrate it with a traditional optimizer to form a simulated environment. Leveraging this simulated environment, FOSS can bootstrap itself to rapidly generate a large amount of high-quality simulated experiences. FOSS then learns from these experiences to improve its optimization capability. We evaluate the performance of FOSS on Join Order Benchmark, TPC-DS, and Stack Overflow. The experimental results demonstrate that FOSS outperforms the state-of-the-art methods in terms of latency performance. Compared to PostgreSQL, FOSS achieves speedup ranging from 1.15x to 8.33x in total latency across different benchmarks.
Related papers
- Large Scale Multi-Task Bayesian Optimization with Large Language Models [29.12351845364205]
We introduce a novel approach leveraging large language models (LLMs) to learn from, and improve upon, previous optimization trajectories.
We evaluate our method on two distinct domains: database query optimization and antimicrobial peptide design.
arXiv Detail & Related papers (2025-03-11T07:46:19Z) - HERO: Hint-Based Efficient and Reliable Query Optimizer [0.0]
We propose a novel model for learned query optimization which provides query hints leading to better execution plans.
The model addresses the three key challenges in learned hint-based query optimization: reliable hint recommendation, efficient hint exploration, and fast inference.
Our model is interpretable and easy to debug, which is particularly important for deployment in production.
arXiv Detail & Related papers (2024-12-03T10:58:34Z) - Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System [75.25394449773052]
Large Language Model (LLM) based multi-agent systems (MAS) show remarkable potential in collaborative problem-solving.
Yet they still face critical challenges: low communication efficiency, poor scalability, and a lack of effective parameter-updating optimization methods.
We present Optima, a novel framework that addresses these issues by significantly enhancing both communication efficiency and task effectiveness.
arXiv Detail & Related papers (2024-10-10T17:00:06Z) - Discovering Preference Optimization Algorithms with and for Large Language Models [50.843710797024805]
offline preference optimization is a key method for enhancing and controlling the quality of Large Language Model (LLM) outputs.
We perform objective discovery to automatically discover new state-of-the-art preference optimization algorithms without (expert) human intervention.
Experiments demonstrate the state-of-the-art performance of DiscoPOP, a novel algorithm that adaptively blends logistic and exponential losses.
arXiv Detail & Related papers (2024-06-12T16:58:41Z) - Two Optimizers Are Better Than One: LLM Catalyst Empowers Gradient-Based Optimization for Prompt Tuning [69.95292905263393]
We show that gradient-based optimization and large language models (MsLL) are complementary to each other, suggesting a collaborative optimization approach.
Our code is released at https://www.guozix.com/guozix/LLM-catalyst.
arXiv Detail & Related papers (2024-05-30T06:24:14Z) - Adaptive Neural Ranking Framework: Toward Maximized Business Goal for
Cascade Ranking Systems [33.46891569350896]
Cascade ranking is widely used for large-scale top-k selection problems in online advertising and recommendation systems.
Previous works on learning-to-rank usually focus on letting the model learn the complete order or top-k order.
We name this method as Adaptive Neural Ranking Framework (abbreviated as ARF)
arXiv Detail & Related papers (2023-10-16T14:43:02Z) - BitE : Accelerating Learned Query Optimization in a Mixed-Workload
Environment [0.36700088931938835]
BitE is a novel ensemble learning model using database statistics and metadata to tune a learned query for enhancing performance.
Our model achieves 19.6% more improved queries and 15.8% less regressed queries compared to the existing traditional methods.
arXiv Detail & Related papers (2023-06-01T16:05:33Z) - VeLO: Training Versatile Learned Optimizers by Scaling Up [67.90237498659397]
We leverage the same scaling approach behind the success of deep learning to learn versatiles.
We train an ingest for deep learning which is itself a small neural network that ingests and outputs parameter updates.
We open source our learned, meta-training code, the associated train test data, and an extensive benchmark suite with baselines at velo-code.io.
arXiv Detail & Related papers (2022-11-17T18:39:07Z) - Joint inference and input optimization in equilibrium networks [68.63726855991052]
deep equilibrium model is a class of models that foregoes traditional network depth and instead computes the output of a network by finding the fixed point of a single nonlinear layer.
We show that there is a natural synergy between these two settings.
We demonstrate this strategy on various tasks such as training generative models while optimizing over latent codes, training models for inverse problems like denoising and inpainting, adversarial training and gradient based meta-learning.
arXiv Detail & Related papers (2021-11-25T19:59:33Z) - Learning to Optimize: A Primer and A Benchmark [94.29436694770953]
Learning to optimize (L2O) is an emerging approach that leverages machine learning to develop optimization methods.
This article is poised to be the first comprehensive survey and benchmark of L2O for continuous optimization.
arXiv Detail & Related papers (2021-03-23T20:46:20Z) - Daydream: Accurately Estimating the Efficacy of Optimizations for DNN
Training [8.157520622932374]
profiling tools do not aim to answer predictive questions such as "How will optimization X affect the performance of my model?"
We propose a new profiling tool, Daydream, to help programmers efficiently explore the efficacy of DNN optimizations.
We show that Daydream is able to model most mainstream DNN optimization techniques, and accurately predict the efficacy of optimizations that will result in significant performance improvements.
arXiv Detail & Related papers (2020-06-05T09:08:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.