Auction-Based Scheduling
- URL: http://arxiv.org/abs/2310.11798v2
- Date: Wed, 31 Jan 2024 19:08:28 GMT
- Title: Auction-Based Scheduling
- Authors: Guy Avni, Kaushik Mallik, Suman Sadhukhan
- Abstract summary: Auction-based scheduling is a modular framework for multi-objective decision-making problems.
Each objective is fulfilled using a separate policy, and the policies can be independently created, modified, and replaced.
We present decentralized algorithms to synthesize a pair of policies, their initially allocated budgets, and bidding strategies.
- Score: 2.3326951882644553
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many sequential decision-making tasks require satisfaction of multiple,
partially contradictory objectives. Existing approaches are monolithic, namely
all objectives are fulfilled using a single policy, which is a function that
selects a sequence of actions. We present auction-based scheduling, a modular
framework for multi-objective decision-making problems. Each objective is
fulfilled using a separate policy, and the policies can be independently
created, modified, and replaced. Understandably, different policies with
conflicting goals may choose conflicting actions at a given time. In order to
resolve conflicts and compose policies, we employ a novel auction-based
mechanism. We allocate a bounded budget to each policy, and at each step, the
policies simultaneously bid from their available budgets for the privilege of
being scheduled and choosing an action. Policies express their scheduling
urgency using their bids and the bounded budgets ensure long-run scheduling
fairness. We lay the foundations of auction-based scheduling using path
planning problems on finite graphs with two temporal objectives. We present
decentralized algorithms to synthesize a pair of policies, their initially
allocated budgets, and bidding strategies. We consider three categories of
decentralized synthesis problems, parameterized by the assumptions that the
policies make on each other: (a) strong synthesis, with no assumptions and
strongest guarantees, (b) assume-admissible synthesis, with weakest rationality
assumptions, and (c) assume-guarantee synthesis, with explicit contract-based
assumptions. For reachability objectives, we show that, surprisingly,
decentralized assume-admissible synthesis is always possible when the
out-degrees of all vertices are at most two.
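To make the auction mechanism concrete, the following is a minimal Python sketch of one scheduling round for the two-policy setting described in the abstract. The `bid`/`act` interface, the random tie-breaking, and the rule that the winner's bid is transferred to the other policy are illustrative assumptions, not the paper's exact definitions.
```python
import random

def auction_step(policies, budgets, state, rng=None):
    """One round of auction-based scheduling (illustrative sketch).

    Each element of `policies` is assumed to expose
        bid(state, budget) -> a bid of at most `budget`
        act(state)         -> the action that policy wants to take
    The tie-breaking rule and the budget transfer below are assumptions
    made for illustration; the paper defines the exact bidding semantics.
    """
    rng = rng or random.Random(0)

    # Policies bid simultaneously from their available budgets.
    bids = [min(p.bid(state, b), b) for p, b in zip(policies, budgets)]

    # The highest bidder wins the privilege of choosing the action
    # (ties broken uniformly at random -- an assumption).
    top = max(bids)
    winner = rng.choice([i for i, b in enumerate(bids) if b == top])

    # The winner pays its bid, which is redistributed to the other
    # policies so the total budget stays bounded (assumed transfer rule).
    new_budgets = list(budgets)
    new_budgets[winner] -= bids[winner]
    share = bids[winner] / (len(budgets) - 1)
    for i in range(len(budgets)):
        if i != winner:
            new_budgets[i] += share

    return policies[winner].act(state), new_budgets
```
Repeatedly calling `auction_step` from the initially allocated budgets composes the two policies into a single scheduler; the bounded total budget is what enforces long-run scheduling fairness.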
Related papers
- Last-Iterate Global Convergence of Policy Gradients for Constrained Reinforcement Learning [62.81324245896717]
We introduce an exploration-agnostic algorithm, called C-PG, which exhibits global last-iterate convergence guarantees under (weak) gradient domination assumptions.
We numerically validate our algorithms on constrained control problems, and compare them with state-of-the-art baselines.
arXiv Detail & Related papers (2024-07-15T14:54:57Z)
- Probabilistic Reach-Avoid for Bayesian Neural Networks [71.67052234622781]
We show that an optimal synthesis algorithm can provide more than a four-fold increase in the number of certifiable states.
The algorithm is able to provide more than a three-fold increase in the average guaranteed reach-avoid probability.
arXiv Detail & Related papers (2023-10-03T10:52:21Z)
- Imitating Graph-Based Planning with Goal-Conditioned Policies [72.61631088613048]
We present a self-imitation scheme which distills a subgoal-conditioned policy into the target-goal-conditioned policy.
We empirically show that our method can significantly boost the sample-efficiency of the existing goal-conditioned RL methods.
arXiv Detail & Related papers (2023-03-20T14:51:10Z)
- Goal-conditioned Offline Reinforcement Learning through State Space Partitioning [9.38848713730931]
Offline reinforcement learning (RL) aims to infer sequential decision policies using only offline datasets.
We argue that, despite its benefits, this approach is still insufficient to fully address the distribution shift and multi-modality problems.
We propose a complementary advantage-based weighting scheme that introduces an additional source of inductive bias.
arXiv Detail & Related papers (2023-03-16T14:52:53Z)
- Planning to Practice: Efficient Online Fine-Tuning by Composing Goals in Latent Space [76.46113138484947]
General-purpose robots require diverse repertoires of behaviors to complete challenging tasks in real-world unstructured environments.
To address this issue, goal-conditioned reinforcement learning aims to acquire policies that can reach goals for a wide range of tasks on command.
We propose Planning to Practice, a method that makes it practical to train goal-conditioned policies for long-horizon tasks.
arXiv Detail & Related papers (2022-05-17T06:58:17Z)
- Constructing a Good Behavior Basis for Transfer using Generalized Policy Updates [63.58053355357644]
We study the problem of learning a good set of policies, so that when combined together, they can solve a wide variety of unseen reinforcement learning tasks.
We show theoretically that having access to a specific set of diverse policies, which we call a set of independent policies, can allow for instantaneously achieving high-level performance.
arXiv Detail & Related papers (2021-12-30T12:20:46Z)
- Anytime Stochastic Task and Motion Policies [12.72186877599064]
We present a new approach for integrated task and motion planning in stochastic settings.
Our algorithm is probabilistically complete and can compute feasible solution policies in an anytime fashion.
arXiv Detail & Related papers (2021-08-28T00:23:39Z)
- Composable Energy Policies for Reactive Motion Generation and Reinforcement Learning [25.498555742173323]
We introduce Composable Energy Policies (CEP), a novel framework for modular motion generation.
CEP computes the control action by optimization over the product of a set of reactive policies.
CEP naturally adapts to the reinforcement learning problem, allowing any distribution to be integrated hierarchically as a prior.
arXiv Detail & Related papers (2021-05-11T11:59:13Z)
- CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee [61.176159046544946]
In safe reinforcement learning (SRL) problems, an agent explores the environment to maximize an expected total reward and avoids violation of certain constraints.
This is the first analysis of SRL algorithms with convergence to globally optimal policies.
arXiv Detail & Related papers (2020-11-11T16:05:14Z)
This list is automatically generated from the titles and abstracts of the papers listed on this site.
This site does not guarantee the quality of the information shown and is not responsible for any consequences of its use.