Buffer Pool Aware Query Scheduling via Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2007.10568v3
- Date: Wed, 27 Jul 2022 02:02:32 GMT
- Title: Buffer Pool Aware Query Scheduling via Deep Reinforcement Learning
- Authors: Chi Zhang, Ryan Marcus, Anat Kleiman, Olga Papaemmanouil
- Abstract summary: We introduce SmartQueue, a learned scheduler that leverages overlapping data reads among incoming queries.
SmartQueue relies on deep reinforcement learning to produce workload-specific scheduling strategies.
We present results from a proof-of-concept prototype, demonstrating that learned schedulers can offer significant performance improvements.
- Score: 12.388301931687893
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this extended abstract, we propose a new technique for query scheduling
with the explicit goal of reducing disk reads and thus implicitly increasing
query performance. We introduce SmartQueue, a learned scheduler that leverages
overlapping data reads among incoming queries and learns a scheduling strategy
that improves cache hits. SmartQueue relies on deep reinforcement learning to
produce workload-specific scheduling strategies that focus on long-term
performance benefits while being adaptive to previously-unseen data access
patterns. We present results from a proof-of-concept prototype, demonstrating
that learned schedulers can offer significant performance improvements over
hand-crafted scheduling heuristics. Ultimately, we make the case that this is a
promising research direction at the intersection of machine learning and
databases.
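To make the idea above concrete, the sketch below simulates an LRU buffer pool and an epsilon-greedy scheduler that learns to favor queued queries whose blocks overlap the pool's current contents. It is a minimal illustration only: SmartQueue itself uses deep reinforcement learning over query/buffer state, and every name here (`BufferPool`, `LearnedScheduler`, `run_query`) is an assumption of this sketch, not the authors' code.
```python
import random
from collections import OrderedDict

class BufferPool:
    """Toy LRU buffer pool over fixed-size blocks."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()              # block id -> None, kept in LRU order

    def read(self, block):
        hit = block in self.blocks
        if hit:
            self.blocks.move_to_end(block)       # refresh LRU position
        else:
            if len(self.blocks) >= self.capacity:
                self.blocks.popitem(last=False)  # evict least recently used block
            self.blocks[block] = None
        return hit

def run_query(pool, blocks):
    """Execute a query's reads; return its buffer hit ratio (the reward signal)."""
    hits = sum(pool.read(b) for b in blocks)
    return hits / max(len(blocks), 1)

class LearnedScheduler:
    """Epsilon-greedy scheduler scoring each queued query with a learned linear
    value over one feature: the fraction of its blocks already in the pool."""
    def __init__(self, lr=0.1, eps=0.1):
        self.w, self.b, self.lr, self.eps = 0.0, 0.0, lr, eps

    def overlap(self, pool, blocks):
        return sum(b in pool.blocks for b in blocks) / max(len(blocks), 1)

    def pick(self, pool, queue):
        if random.random() < self.eps:                          # explore
            return random.randrange(len(queue))
        scores = [self.w * self.overlap(pool, q) + self.b for q in queue]
        return max(range(len(queue)), key=scores.__getitem__)   # exploit

    def update(self, feature, reward):
        err = reward - (self.w * feature + self.b)              # simple regression step
        self.w += self.lr * err * feature
        self.b += self.lr * err

# Tiny workload: ten queries, each touching 15 of 50 blocks; pool holds 20 blocks.
random.seed(0)
queries = [random.sample(range(50), 15) for _ in range(10)]
pool, sched = BufferPool(20), LearnedScheduler()
queue = list(queries)
while queue:
    i = sched.pick(pool, queue)
    q = queue.pop(i)
    feat = sched.overlap(pool, q)        # feature measured before execution
    sched.update(feat, run_query(pool, q))
print("learned weight on buffer overlap:", round(sched.w, 3))
```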
Related papers
- BQSched: A Non-intrusive Scheduler for Batch Concurrent Queries via Reinforcement Learning [7.738546538164454]
A key issue in minimizing the overall makespan of data pipelines is the efficient scheduling of concurrent queries.
To our knowledge, BQSched is the first non-intrusive batch query scheduler via reinforcement learning.
Extensive experiments show that BQSched can significantly improve the efficiency and stability of batch query scheduling.
arXiv Detail & Related papers (2025-04-27T07:49:01Z)
- Queueing, Predictions, and LLMs: Challenges and Open Problems [9.22255012731159]
Queueing systems present opportunities for applying machine-learning predictions, such as estimated service times, to improve system performance.
Recent studies explore queues with predicted service times, typically aiming to minimize job time in the system.
We consider an important practical example of using predictions in scheduling, namely Large Language Model (LLM) systems.
arXiv Detail & Related papers (2025-03-10T17:12:47Z)
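As a toy illustration of the "queues with predicted service times" setting summarized above, the sketch below compares first-come-first-served against shortest-predicted-job-first on noisy size predictions, assuming a single non-preemptive server with all jobs present at time zero; it is not the paper's model and ignores the LLM-serving specifics.
```python
import random

def mean_completion_time(order, sizes):
    """Single non-preemptive server; every job is queued at time zero."""
    t, total = 0.0, 0.0
    for j in order:
        t += sizes[j]                 # true (unknown) service time
        total += t
    return total / len(sizes)

random.seed(1)
true_sizes = [random.expovariate(1.0) for _ in range(200)]
# noisy ML-style predictions of each service time (multiplicative log-normal noise)
predicted = [s * random.lognormvariate(0.0, 0.5) for s in true_sizes]

fcfs = list(range(len(true_sizes)))                    # first-come-first-served order
spt_pred = sorted(fcfs, key=lambda j: predicted[j])    # shortest *predicted* job first

print("FCFS mean completion time:", round(mean_completion_time(fcfs, true_sizes), 2))
print("SPT on predictions       :", round(mean_completion_time(spt_pred, true_sizes), 2))
```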
- Prediction-Assisted Online Distributed Deep Learning Workload Scheduling in GPU Clusters [24.845122459974466]
This paper proposes an adaptive shortest-remaining-processing-time-first (A-SRPT) scheduling algorithm.
By modeling each job as a graph corresponding to heterogeneous Deep Neural Network (DNN) models, A-SRPT strategically assigns jobs to the available GPUs.
A-SRPT maps the complex scheduling problem into a single-machine instance, which is addressed optimally by a preemptive "shortest-remaining-processing-time-first" strategy.
arXiv Detail & Related papers (2025-01-09T20:19:01Z)
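The summary above reduces the problem to a single-machine instance solved by preemptive shortest-remaining-processing-time-first; the sketch below simulates exactly that single-machine SRPT policy (the job-graph modeling and GPU assignment from the paper are not reproduced).
```python
import heapq

def srpt_mean_flow_time(jobs):
    """Preemptive shortest-remaining-processing-time-first on one machine.
    jobs: list of (arrival_time, processing_time); returns mean flow time."""
    events = sorted(jobs)                      # arrivals in time order
    remaining = []                             # min-heap of (remaining, arrival, job id)
    t, i, total_flow, n = 0.0, 0, 0.0, len(jobs)
    while i < n or remaining:
        if not remaining:                      # machine idle: jump to next arrival
            t = max(t, events[i][0])
        while i < n and events[i][0] <= t:     # admit everything that has arrived
            arr, proc = events[i]
            heapq.heappush(remaining, (proc, arr, i))
            i += 1
        rem, arr, jid = heapq.heappop(remaining)
        horizon = events[i][0] if i < n else float("inf")
        run = min(rem, horizon - t)            # run until done or the next arrival
        t, rem = t + run, rem - run
        if rem > 1e-12:
            heapq.heappush(remaining, (rem, arr, jid))   # preempted, back in the heap
        else:
            total_flow += t - arr                        # job finished
    return total_flow / n

print(srpt_mean_flow_time([(0, 5), (1, 1), (2, 3), (6, 0.5)]))   # -> 3.5
```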
- Learning to Plan for Retrieval-Augmented Large Language Models from Knowledge Graphs [59.76268575344119]
We introduce a novel framework for enhancing large language models' (LLMs) planning capabilities by using planning data derived from knowledge graphs (KGs).
LLMs fine-tuned with KG data have improved planning capabilities, better equipping them to handle complex QA tasks that involve retrieval.
arXiv Detail & Related papers (2024-06-20T13:07:38Z)
- Can Graph Learning Improve Planning in LLM-based Agents? [61.47027387839096]
Task planning in language agents is emerging as an important research topic alongside the development of large language models (LLMs).
In this paper, we explore graph learning-based methods for task planning, a direction that is orthogonal to the prevalent focus on prompt design.
Our interest in graph learning stems from a theoretical discovery: the biases of attention and auto-regressive loss impede LLMs' ability to effectively navigate decision-making on graphs.
arXiv Detail & Related papers (2024-05-29T14:26:24Z)
- Schedule-Robust Online Continual Learning [45.325658404913945]
A continual learning (CL) algorithm learns from a non-stationary data stream.
A key challenge in CL is to design methods robust against arbitrary schedules over the same underlying data.
We present a new perspective on CL, as the process of learning a schedule-robust predictor, followed by adapting the predictor using only replay data.
arXiv Detail & Related papers (2022-10-11T15:55:06Z)
- Accelerating Deep Learning Classification with Error-controlled Approximate-key Caching [72.50506500576746]
We propose a novel caching paradigm that we name approximate-key caching.
While approximate cache hits alleviate the DL inference workload and increase system throughput, they introduce an approximation error.
We analytically model our caching system performance for classic LRU and ideal caches, perform a trace-driven evaluation of the expected performance, and compare the benefits of our proposed approach with state-of-the-art similarity caching.
arXiv Detail & Related papers (2021-12-13T13:49:11Z)
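A hypothetical scalar-key sketch of approximate-key caching as summarized above: a lookup hits when a cached key lies within a tolerance of the query key, on top of a classic LRU store, and the error counter shows the approximation error such hits can introduce. `ApproxKeyLRUCache` and `expensive_model` are illustrative names, not the paper's system.
```python
import random
from collections import OrderedDict

class ApproxKeyLRUCache:
    """Approximate-key cache over scalar keys: get() hits when some cached
    key lies within `tol` of the query key (linear scan kept for clarity)."""
    def __init__(self, capacity, tol):
        self.capacity, self.tol = capacity, tol
        self.store = OrderedDict()                # key -> value, in LRU order

    def get(self, key):
        best = None
        for k in self.store:                      # nearest cached key within tolerance
            if abs(k - key) <= self.tol and (best is None or abs(k - key) < abs(best - key)):
                best = k
        if best is None:
            return None
        self.store.move_to_end(best)
        return self.store[best]                   # approximate hit: may carry an error

    def put(self, key, value):
        if len(self.store) >= self.capacity:
            self.store.popitem(last=False)        # classic LRU eviction
        self.store[key] = value

def expensive_model(x):
    """Stand-in for a costly DL classifier on a 1-D feature."""
    return "hot" if x > 0.5 else "cold"

random.seed(2)
cache, hits, errors = ApproxKeyLRUCache(capacity=100, tol=0.01), 0, 0
for _ in range(1000):
    x = round(random.random(), 3)
    y = cache.get(x)
    if y is None:
        y = expensive_model(x)
        cache.put(x, y)
    else:
        hits += 1
        errors += y != expensive_model(x)         # the approximation error the paper controls
print("approximate hit ratio:", hits / 1000, "error rate among hits:", errors / max(hits, 1))
```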
- Scheduling in Parallel Finite Buffer Systems: Optimal Decisions under Delayed Feedback [29.177402567437206]
We present a partially observable (PO) model that captures the scheduling decisions in parallel queuing systems under limited information of delayed acknowledgements.
We numerically show that the resulting policy outperforms other limited information scheduling strategies.
We show how our approach can optimise real-time parallel processing using network data provided by Kaggle.
arXiv Detail & Related papers (2021-09-17T13:45:02Z)
- Better than the Best: Gradient-based Improper Reinforcement Learning for Network Scheduling [60.48359567964899]
We consider the problem of scheduling in constrained queueing networks with a view to minimizing packet delay.
We use a policy gradient based reinforcement learning algorithm that produces a scheduler that performs better than the available atomic policies.
arXiv Detail & Related papers (2021-05-01T10:18:34Z)
- Value Function Based Performance Optimization of Deep Learning Workloads [3.6827120585356528]
We present a new technique to accurately predict the expected performance of a partial schedule.
This enables us to find schedules that improve the throughput of deep neural networks by 2.6x over Halide and 1.5x over TVM.
arXiv Detail & Related papers (2020-11-30T01:20:14Z)
- Learning from Data to Speed-up Sorted Table Search Procedures: Methodology and Practical Guidelines [0.0]
We study to what extent Machine Learning techniques can contribute to obtaining such a speed-up.
We characterize the scenarios in which the latter can be profitably used with respect to the former, accounting for both CPU and GPU computing.
Indeed, we formalize an Algorithmic Paradigm of Learned Dichotomic Sorted Table Search procedures that naturally complements the Learned one proposed here and that characterizes most of the known Sorted Table Search Procedures as having a "learning phase" that approximates Simple Linear Regression.
arXiv Detail & Related papers (2020-07-20T16:26:54Z)
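A compact sketch of the "learning phase that approximates Simple Linear Regression" mentioned above: fit a least-squares line from key to position, then binary-search only inside the model's maximum-error window. This is a generic learned-index illustration under the stated assumptions, not the paper's exact procedures.
```python
import bisect

def fit_linear_index(keys):
    """Least-squares line mapping key -> position, plus its max absolute error."""
    n = len(keys)
    xs, ys = keys, range(n)
    mx, my = sum(xs) / n, (n - 1) / 2
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs) or 1.0
    slope = cov / var
    intercept = my - slope * mx
    err = max(abs((slope * x + intercept) - y) for x, y in zip(xs, ys))
    return slope, intercept, int(err) + 1          # +1 margin covers rounding

def learned_search(keys, model, target):
    slope, intercept, err = model
    guess = int(slope * target + intercept)        # predicted position
    lo, hi = max(0, guess - err), min(len(keys), guess + err + 1)
    i = bisect.bisect_left(keys, target, lo, hi)   # binary search inside the error window
    return i if i < len(keys) and keys[i] == target else -1

keys = sorted(range(0, 10_000, 7))
model = fit_linear_index(keys)
print(learned_search(keys, model, 7 * 123))   # -> 123
print(learned_search(keys, model, 5))         # -> -1 (absent)
```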
- Temporally Correlated Task Scheduling for Sequence Learning [143.70523777803723]
In many applications, a sequence learning task is usually associated with multiple temporally correlated auxiliary tasks.
We introduce a learnable scheduler to sequence learning, which can adaptively select auxiliary tasks for training.
Our method significantly improves the performance of simultaneous machine translation and stock trend forecasting.
arXiv Detail & Related papers (2020-07-10T10:28:54Z)
- Sequential Recommender via Time-aware Attentive Memory Network [67.26862011527986]
We propose a temporal gating methodology to improve the attention mechanism and recurrent units.
We also propose a Multi-hop Time-aware Attentive Memory network to integrate long-term and short-term preferences.
Our approach is scalable for candidate retrieval tasks and can be viewed as a non-linear generalization of latent factorization for dot-product based Top-K recommendation.
arXiv Detail & Related papers (2020-05-18T11:29:38Z)