Deploying a Steered Query Optimizer in Production at Microsoft
- URL: http://arxiv.org/abs/2210.13625v1
- Date: Mon, 24 Oct 2022 21:57:57 GMT
- Title: Deploying a Steered Query Optimizer in Production at Microsoft
- Authors: Wangda Zhang, Matteo Interlandi, Paul Mineiro, Shi Qiao, Nasim
Ghazanfari, Karlen Lie, Marc Friedman, Rafah Hosn, Hiren Patel, Alekh Jindal
- Abstract summary: We continue a recent line of work in steering a query optimizer towards better plans for a given workload, and make major strides in pushing previous research ideas to production.
Along the way we solve several challenges, including making steering actions more manageable, keeping the costs of steering within budget, and avoiding unexpected performance regressions in production.
Our resulting system, QO-Advisor, essentially externalizes the query planner to a massive offline pipeline for better exploration and specialization.
- Score: 10.647568709854877
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern analytical workloads are highly heterogeneous and massively complex,
making generic query optimizers untenable for many customers and scenarios. As
a result, it is important to specialize these optimizers to instances of the
workloads. In this paper, we continue a recent line of work in steering a query
optimizer towards better plans for a given workload, and make major strides in
pushing previous research ideas to production deployment. Along the way we
solve several operational challenges, including making steering actions more
manageable, keeping the costs of steering within budget, and avoiding
unexpected performance regressions in production. Our resulting system,
QO-Advisor, essentially externalizes the query planner to a massive offline
pipeline for better exploration and specialization. We discuss various aspects
of our design and show detailed results over production SCOPE workloads at
Microsoft, where the system is currently enabled by default.
Related papers
- The Effect of Scheduling and Preemption on the Efficiency of LLM Inference Serving [8.552242818726347]
INFERMAX is an analytical framework that uses inference cost models to compare various schedulers.
Our findings indicate that preempting requests can reduce GPU costs by 30% compared to avoiding preemptions at all.
arXiv Detail & Related papers (2024-11-12T00:10:34Z) - Benchmarking Agentic Workflow Generation [80.74757493266057]
We introduce WorFBench, a unified workflow generation benchmark with multi-faceted scenarios and intricate graph workflow structures.
We also present WorFEval, a systemic evaluation protocol utilizing subsequence and subgraph matching algorithms.
We observe that the generated workflows can enhance downstream tasks, enabling them to achieve superior performance with less time during inference.
arXiv Detail & Related papers (2024-10-10T12:41:19Z) - Unlocking Reasoning Potential in Large Langauge Models by Scaling Code-form Planning [94.76546523689113]
We introduce CodePlan, a framework that generates and follows textcode-form plans -- pseudocode that outlines high-level, structured reasoning processes.
CodePlan effectively captures the rich semantics and control flows inherent to sophisticated reasoning tasks.
It achieves a 25.1% relative improvement compared with directly generating responses.
arXiv Detail & Related papers (2024-09-19T04:13:58Z) - QPO: Query-dependent Prompt Optimization via Multi-Loop Offline Reinforcement Learning [58.767866109043055]
We introduce Query-dependent Prompt Optimization (QPO), which iteratively fine-tunes a small pretrained language model to generate optimal prompts tailored to the input queries.
We derive insights from offline prompting demonstration data, which already exists in large quantities as a by-product of benchmarking diverse prompts on open-sourced tasks.
Experiments on various LLM scales and diverse NLP and math tasks demonstrate the efficacy and cost-efficiency of our method in both zero-shot and few-shot scenarios.
arXiv Detail & Related papers (2024-08-20T03:06:48Z) - Learning Logic Specifications for Policy Guidance in POMDPs: an
Inductive Logic Programming Approach [57.788675205519986]
We learn high-quality traces from POMDP executions generated by any solver.
We exploit data- and time-efficient Inductive Logic Programming (ILP) to generate interpretable belief-based policy specifications.
We show that learned specifications expressed in Answer Set Programming (ASP) yield performance superior to neural networks and similar to optimal handcrafted task-specific heuristics within lower computational time.
arXiv Detail & Related papers (2024-02-29T15:36:01Z) - Roq: Robust Query Optimization Based on a Risk-aware Learned Cost Model [3.0784574277021406]
We propose a holistic framework that enables robust query optimization based on a risk-aware learning approach.
Roq includes a novel formalization of the notion of robustness in the context of query optimization.
We demonstrate experimentally that Roq provides significant improvements to robust query optimization compared to the state-of-the-art.
arXiv Detail & Related papers (2024-01-26T21:16:37Z) - Sibyl: Forecasting Time-Evolving Query Workloads [9.16115447503004]
Database systems often rely on historical query traces to perform workload-based performance tuning.
Real production workloads are time-evolving, making historical queries ineffective for optimizing future workloads.
We propose SIBYL, an end-to-end machine learning-based framework that accurately forecasts a sequence of future queries.
arXiv Detail & Related papers (2024-01-08T08:11:32Z) - JoinGym: An Efficient Query Optimization Environment for Reinforcement
Learning [58.71541261221863]
Join order selection (JOS) is the problem of ordering join operations to minimize total query execution cost.
We present JoinGym, a query optimization environment for reinforcement learning (RL) that supports bushy join plans.
Under the hood, JoinGym simulates a query plan's cost by looking up intermediate result cardinalities from a pre-computed dataset.
arXiv Detail & Related papers (2023-07-21T17:00:06Z) - Large Language Models for Supply Chain Optimization [4.554094815136834]
We study how Large Language Models (LLMs) can help bridge the gap between supply chain automation and human comprehension and trust thereof.
We design OptiGuide -- a framework that accepts as input queries in plain text, and outputs insights about the underlying outcomes.
We demonstrate the effectiveness of our framework on a real server placement scenario within Microsoft's cloud supply chain.
arXiv Detail & Related papers (2023-07-08T01:42:22Z) - BitE : Accelerating Learned Query Optimization in a Mixed-Workload
Environment [0.36700088931938835]
BitE is a novel ensemble learning model using database statistics and metadata to tune a learned query optimizer for enhancing performance.
Our model achieves 19.6% more improved queries and 15.8% fewer regressed queries compared to the existing traditional methods.
arXiv Detail & Related papers (2023-06-01T16:05:33Z) - A Research Agenda for Artificial Intelligence in the Field of Flexible
Production Systems [53.47496941841855]
Production companies face problems when it comes to quickly adapting their production control to fluctuating demands or changing requirements.
Control approaches aiming to encapsulate production functions in the sense of services have been shown to be promising in order to increase flexibility of Cyber-Physical Production Systems.
But an existing challenge of such approaches is finding production plans based on provided functionalities for a set of requirements, especially when there is no direct (i.e., syntactic) match between demanded and provided functions.
arXiv Detail & Related papers (2021-12-31T14:38:31Z)