Sibyl: Forecasting Time-Evolving Query Workloads
- URL: http://arxiv.org/abs/2401.03723v1
- Date: Mon, 8 Jan 2024 08:11:32 GMT
- Title: Sibyl: Forecasting Time-Evolving Query Workloads
- Authors: Hanxian Huang, Tarique Siddiqui, Rana Alotaibi, Carlo Curino, Jyoti
Leeka, Alekh Jindal, Jishen Zhao, Jesus Camacho-Rodriguez, Yuanyuan Tian
- Abstract summary: Database systems often rely on historical query traces to perform workload-based performance tuning.
Real production workloads are time-evolving, making historical queries ineffective for optimizing future workloads.
We propose SIBYL, an end-to-end machine learning-based framework that accurately forecasts a sequence of future queries.
- Score: 9.16115447503004
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Database systems often rely on historical query traces to perform
workload-based performance tuning. However, real production workloads are
time-evolving, making historical queries ineffective for optimizing future
workloads. To address this challenge, we propose SIBYL, an end-to-end machine
learning-based framework that accurately forecasts a sequence of future
queries, with the entire query statements, in various prediction windows.
Drawing insights from real-workloads, we propose template-based featurization
techniques and develop a stacked-LSTM with an encoder-decoder architecture for
accurate forecasting of query workloads. We also develop techniques to improve
forecasting accuracy over large prediction windows and achieve high scalability
over large workloads with high variability in arrival rates of queries.
Finally, we propose techniques to handle workload drifts. Our evaluation on
four real workloads demonstrates that SIBYL can forecast workloads with an
$87.3\%$ median F1 score, and can result in $1.7\times$ and $1.3\times$
performance improvement when applied to materialized view selection and index
selection applications, respectively.
Related papers
- F-FOMAML: GNN-Enhanced Meta-Learning for Peak Period Demand Forecasting with Proxy Data [65.6499834212641]
We formulate the demand prediction as a meta-learning problem and develop the Feature-based First-Order Model-Agnostic Meta-Learning (F-FOMAML) algorithm.
By considering domain similarities through task-specific metadata, our model improved generalization, where the excess risk decreases as the number of training tasks increases.
Compared to existing state-of-the-art models, our method demonstrates a notable improvement in demand prediction accuracy, reducing the Mean Absolute Error by 26.24% on an internal vending machine dataset and by 1.04% on the publicly accessible JD.com dataset.
arXiv Detail & Related papers (2024-06-23T21:28:50Z) - AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning [93.96463520716759]
Large language model (LLM) agents have demonstrated impressive capabilities in utilizing external tools and knowledge to boost accuracy and reduce hallucinations.
Here, we introduce AvaTaR, a novel and automated framework that optimizes an LLM agent to effectively leverage provided tools, improving performance on a given task.
arXiv Detail & Related papers (2024-06-17T04:20:02Z) - PePNet: A Periodicity-Perceived Workload Prediction Network Supporting Rare Occurrence of Heavy Workload [11.93843096959306]
The workload of cloud servers is highly variable, with occasional heavy workload bursts.
There are two categories of workload prediction methods: statistical methods and neural-network-based ones.
We propose PePNet to improve overall prediction accuracy, especially for heavy workloads.
arXiv Detail & Related papers (2023-07-11T07:56:27Z) - Improving Text Matching in E-Commerce Search with A Rationalizable,
Intervenable and Fast Entity-Based Relevance Model [78.80174696043021]
We propose a novel model called the Entity-Based Relevance Model (EBRM).
The decomposition allows us to use a Cross-encoder QE relevance module for high accuracy.
We also show that pretraining the QE module with auto-generated QE data from user logs can further improve the overall performance.
arXiv Detail & Related papers (2023-07-01T15:44:53Z) - Kepler: Robust Learning for Faster Parametric Query Optimization [5.6119420695093245]
We propose an end-to-end learning-based approach to parametric query optimization.
Kepler achieves significant improvements in query runtime on multiple datasets.
arXiv Detail & Related papers (2023-06-11T22:39:28Z) - BitE : Accelerating Learned Query Optimization in a Mixed-Workload
Environment [0.36700088931938835]
BitE is a novel ensemble learning model using database statistics and metadata to tune a learned query optimizer for enhancing performance.
Our model achieves 19.6% more improved queries and 15.8% fewer regressed queries compared to the existing traditional methods.
arXiv Detail & Related papers (2023-06-01T16:05:33Z) - Back2Future: Leveraging Backfill Dynamics for Improving Real-time
Predictions in Future [73.03458424369657]
In real-time forecasting in public health, data collection is a non-trivial and demanding task.
The 'backfill' phenomenon and its effect on model performance have barely been studied in the prior literature.
We formulate a novel problem and neural framework Back2Future that aims to refine a given model's predictions in real-time.
arXiv Detail & Related papers (2021-06-08T14:48:20Z) - Database Workload Characterization with Query Plan Encoders [32.941042348628606]
We propose our query plan encoders that learn essential features and their correlations from query plans.
Our pretrained encoders capture the structural and the computational performance of queries independently.
arXiv Detail & Related papers (2021-05-26T01:17:27Z) - FES: A Fast Efficient Scalable QoS Prediction Framework [0.9176056742068814]
One of the primary objectives of designing a prediction algorithm is to achieve satisfactory prediction accuracy.
The algorithm has to be faster in terms of prediction time so that it can be integrated into a real-time recommendation system.
The existing algorithms on prediction often compromise on one goal while ensuring the others.
arXiv Detail & Related papers (2021-03-12T19:28:17Z) - Towards More Fine-grained and Reliable NLP Performance Prediction [85.78131503006193]
We make two contributions to improving performance prediction for NLP tasks.
First, we examine performance predictors for holistic measures of accuracy like F1 or BLEU.
Second, we propose methods to understand the reliability of a performance prediction model from two angles: confidence intervals and calibration.
arXiv Detail & Related papers (2021-02-10T15:23:20Z) - Automated Concatenation of Embeddings for Structured Prediction [75.44925576268052]
We propose Automated Concatenation of Embeddings (ACE) to automate the process of finding better concatenations of embeddings for structured prediction tasks.
We follow strategies in reinforcement learning to optimize the parameters of the controller and compute the reward based on the accuracy of a task model.
arXiv Detail & Related papers (2020-10-10T14:03:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.