Fugu-MT Paper Translation (Abstract): Learning to Score: Tuning Cluster Schedulers through Reinforcement Learning

Paper summary: Learning to Score: Tuning Cluster Schedulers through Reinforcement Learning

arxiv url: http://arxiv.org/abs/2603.10545v1
Date: Wed, 11 Mar 2026 08:54:30 GMT
Status: Translation complete
System update date: 2026-03-12 16:22:32.857609
Title: Learning to Score: Tuning Cluster Schedulers through Reinforcement Learning
Title (reference translation): Learning Scores: Tuning Cluster Schedulers with Reinforcement Learning
Authors: Martin Asenov, Qiwen Deng, Gingfung Yeung, Adam Barker
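Abstract summary: This paper proposes a reinforcement learning approach for learning the weights in scheduler scoring algorithms. The approach is based on a percentage improvement reward, frame-stacking, and limiting domain information. In a lab-based serverless scenario, the proposed approach improves performance on average by 33% compared to fixed weights and by 12% compared to the best-performing baseline.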
Reference score (independently computed attention score): 1.8584311789183756
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Efficiently allocating incoming jobs to nodes in large-scale clusters can lead to substantial improvements in both cluster utilization and job performance. In order to allocate incoming jobs, cluster schedulers usually rely on a set of scoring functions to rank feasible nodes. Results from individual scoring functions are usually weighted equally, which could lead to sub-optimal deployments as the one-size-fits-all solution does not take into account the characteristics of each workload. Tuning the weights of scoring functions, however, requires expert knowledge and is computationally expensive. This paper proposes a reinforcement learning approach for learning the weights in scheduler scoring algorithms with the overall objective of improving the end-to-end performance of jobs for a given cluster. Our approach is based on percentage improvement reward, frame-stacking, and limiting domain information. We propose a percentage improvement reward to address the objective of multi-step parameter tuning. The inclusion of frame-stacking allows for carrying information across an optimization experiment. Limiting domain information prevents overfitting and improves performance in unseen clusters and workloads. The policy is trained on different combinations of workloads and cluster setups. We demonstrate the proposed approach improves performance on average by 33% compared to fixed weights and 12% compared to the best-performing baseline in a lab-based serverless scenario.
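Abstract (reference translation): Efficiently allocating incoming jobs to the nodes of a large-scale cluster can substantially improve both cluster utilization and job performance. To allocate incoming jobs, cluster schedulers usually rely on a set of scoring functions to rank feasible nodes. The results of the individual scoring functions are usually weighted equally, which can lead to sub-optimal deployments because the characteristics of each workload are not taken into account. Tuning the weights of the scoring functions, however, requires expert knowledge and is computationally expensive. This paper proposes a reinforcement learning approach for learning the weights in scheduler scoring algorithms. The approach is based on a percentage improvement reward, frame-stacking, and limiting domain information. A percentage improvement reward is proposed to address the objective of multi-step parameter tuning. Incorporating frame-stacking allows information to be carried across an optimization experiment. Limiting domain information prevents overfitting and improves performance on unseen clusters and workloads. The policy is trained on different combinations of workloads and cluster setups. In a lab-based serverless scenario, the proposed approach improves performance on average by 33% compared to fixed weights and by 12% compared to the best-performing baseline.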
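
The three ingredients named in the abstract (weighted scoring functions, a percentage improvement reward, and frame-stacking) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives no formulas, so the function names, the relative-improvement form of the reward, the lower-is-better metric, and the flat list observation format are all assumptions made for illustration.

```python
from collections import deque


def rank_nodes(node_scores, weights):
    """Rank nodes by the weighted sum of their per-function scores, best first.

    node_scores maps a node name to one score per scoring function
    (hypothetical data layout, not taken from the paper).
    """
    totals = {
        node: sum(w * s for w, s in zip(weights, scores))
        for node, scores in node_scores.items()
    }
    return sorted(totals, key=totals.get, reverse=True)


def percentage_improvement_reward(previous, current):
    """Assumed reward: relative improvement of a lower-is-better metric,
    such as job completion time, between two consecutive tuning steps."""
    return (previous - current) / previous


class FrameStack:
    """Keep the last k observations so a policy can see recent tuning history."""

    def __init__(self, k, initial):
        # Start with k copies of the initial observation.
        self.frames = deque([initial] * k, maxlen=k)

    def push(self, observation):
        # Append the newest observation (the oldest is dropped automatically)
        # and return all stacked frames flattened into one list.
        self.frames.append(observation)
        return [value for frame in self.frames for value in frame]


node_scores = {"node-a": [0.9, 0.1], "node-b": [0.4, 0.8]}
equal_ranking = rank_nodes(node_scores, [1.0, 1.0])
tuned_ranking = rank_nodes(node_scores, [2.0, 0.5])
```

With equal weights the sketch prefers `node-b`; raising the weight of the first scoring function flips the ranking to `node-a`, which is the kind of effect a learned weight vector would exploit.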

Related papers