Reinforcement Learning for Assignment Problem with Time Constraints
- URL: http://arxiv.org/abs/2106.02856v1
- Date: Sat, 5 Jun 2021 10:10:52 GMT
- Title: Reinforcement Learning for Assignment Problem with Time Constraints
- Authors: Sharmin Pathan, Vyom Shrivastava
- Abstract summary: We present an end-to-end framework for the Assignment Problem with multiple tasks mapped to a group of workers.
We train a reinforcement learning agent to find near-optimal solutions to the problem by minimizing the total cost associated with the assignments.
We also demonstrate our results on bin packing and the capacitated vehicle routing problem using the same framework.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present an end-to-end framework for the Assignment Problem with multiple tasks mapped to a group of workers, using reinforcement learning while preserving many constraints. Tasks and workers have time constraints, and there is a cost associated with assigning a worker to a task. Each worker can perform multiple tasks until it exhausts its allowed time units (capacity). We train a reinforcement learning agent to find near-optimal solutions by minimizing the total cost of the assignments while maintaining the hard constraints. We use proximal policy optimization to optimize the model parameters. The model generates a sequence of actions in real time that correspond to assigning tasks to workers, without having to retrain for changes in the dynamic state of the environment. In our problem setting, the reward is computed as the negative of the assignment cost. We also demonstrate our results on bin packing and the capacitated vehicle routing problem using the same framework. Our results outperform Google OR-Tools with MIP and CP-SAT solvers on large problem instances, in terms of both solution quality and computation time.
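To make the setting concrete, here is a minimal sketch of the problem as the abstract describes it, with a greedy policy standing in for the learned PPO policy. This is not code from the paper; the instance sizes, array names, and the greedy rule are all illustrative assumptions. It shows the hard capacity constraint, the per-assignment cost, and the reward computed as the negative of the total assignment cost.

```python
import numpy as np

# Illustrative instance (not from the paper): n_tasks tasks, each consuming
# duration[t] time units, assigned to n_workers workers of limited capacity;
# cost[t, w] is the cost of assigning task t to worker w.
rng = np.random.default_rng(0)
n_tasks, n_workers = 6, 3
duration = np.array([2, 1, 2, 1, 2, 1])   # time units required per task
capacity = np.full(n_workers, 5)          # allowed time units per worker
cost = rng.uniform(1.0, 10.0, size=(n_tasks, n_workers))

remaining = capacity.astype(float)
total_cost = 0.0
for t in range(n_tasks):
    # Hard constraint: only workers with enough remaining capacity are feasible.
    feasible = [w for w in range(n_workers) if remaining[w] >= duration[t]]
    w = min(feasible, key=lambda j: cost[t, j])  # greedy stand-in for the RL policy
    remaining[w] -= duration[t]
    total_cost += cost[t, w]

# In the paper's setting the reward is the negative of the assignment cost.
print(f"total cost {total_cost:.2f}, episode return {-total_cost:.2f}")
```

The OR-Tools baseline mentioned above can be reproduced in spirit with a CP-SAT model of the same instance; the sketch below reuses the arrays defined above and is again an assumption about the model, not the paper's exact formulation. CP-SAT requires integer coefficients, so costs are scaled to hundredths.

```python
from ortools.sat.python import cp_model

# CP-SAT baseline sketch for the instance above.
model = cp_model.CpModel()
int_cost = (cost * 100).astype(int)  # CP-SAT objectives need integer coefficients

# x[t, w] == 1 iff task t is assigned to worker w.
x = {(t, w): model.NewBoolVar(f"x_{t}_{w}")
     for t in range(n_tasks) for w in range(n_workers)}

for t in range(n_tasks):  # each task is assigned to exactly one worker
    model.Add(sum(x[t, w] for w in range(n_workers)) == 1)
for w in range(n_workers):  # hard per-worker capacity constraint
    model.Add(sum(int(duration[t]) * x[t, w] for t in range(n_tasks))
              <= int(capacity[w]))

model.Minimize(sum(int(int_cost[t, w]) * x[t, w]
                   for t in range(n_tasks) for w in range(n_workers)))

solver = cp_model.CpSolver()
if solver.Solve(model) in (cp_model.OPTIMAL, cp_model.FEASIBLE):
    print("CP-SAT minimum cost:", solver.ObjectiveValue() / 100)
```

On a toy instance like this both approaches finish instantly; the abstract's claim is that the learned policy retains solution quality while scaling to large instances faster than these exact solvers.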
Related papers
- Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the order of all the multi-task data for training.
At the task level, we aim to find the optimal task order that minimizes the total cross-task interference risk.
At the instance level, we measure the difficulty of every instance per task and then divide them into easy-to-difficult mini-batches for training.
arXiv Detail & Related papers (2024-01-07T18:12:20Z)
- FAMO: Fast Adaptive Multitask Optimization [48.59232177073481]
We introduce Fast Adaptive Multitask Optimization (FAMO), a dynamic weighting method that decreases task losses in a balanced way.
Our results indicate that FAMO achieves comparable or superior performance to state-of-the-art gradient manipulation techniques.
arXiv Detail & Related papers (2023-06-06T15:39:54Z)
- Robust Subtask Learning for Compositional Generalization [20.54144051436337]
We focus on the problem of training subtask policies so that they can be used to perform any task.
We aim to maximize the worst-case performance over all tasks as opposed to the average-case performance.
arXiv Detail & Related papers (2023-02-06T18:19:25Z)
- Task Adaptive Parameter Sharing for Multi-Task Learning [114.80350786535952]
Task Adaptive Parameter Sharing (TAPS) is a method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers.
Compared to other methods, TAPS retains high accuracy on downstream tasks while introducing few task-specific parameters.
We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
arXiv Detail & Related papers (2022-03-30T23:16:07Z)
- Controllable Dynamic Multi-Task Architectures [92.74372912009127]
We propose a controllable multi-task network that dynamically adjusts its architecture and weights to match the desired task preference as well as the resource constraints.
We propose disentangled training of two hypernetworks, exploiting task affinity and a novel branching regularized loss, to take input preferences and accordingly predict tree-structured models with adapted weights.
arXiv Detail & Related papers (2022-03-28T17:56:40Z)
- A Multiperiod Workforce Scheduling and Routing Problem with Dependent Tasks [0.0]
We study a new Workforce Scheduling and Routing Problem.
In this problem, customers request services from a company.
Tasks belonging to a service may be executed by different teams, and customers may be visited more than once a day.
arXiv Detail & Related papers (2020-08-06T19:31:55Z)
- Dynamic Multi-Robot Task Allocation under Uncertainty and Temporal Constraints [52.58352707495122]
We present a multi-robot allocation algorithm that decouples the key computational challenges of sequential decision-making under uncertainty and multi-agent coordination.
We validate our approach in a wide range of simulations across two distinct domains: multi-arm conveyor-belt pick-and-place and multi-drone delivery dispatch in a city.
arXiv Detail & Related papers (2020-05-27T01:10:41Z)
- Time Adaptive Reinforcement Learning [2.0305676256390934]
Reinforcement learning (RL) makes it possible to solve complex tasks, such as Go, often with stronger performance than humans.
Here we consider adapting RL agents to different time restrictions, such as finishing a task within a given time limit that might change from one task execution to the next.
We introduce two model-free, value-based algorithms: the Independent Gamma-Ensemble and the n-Step Ensemble.
arXiv Detail & Related papers (2020-04-18T11:52:07Z)
- Hierarchical Reinforcement Learning as a Model of Human Task Interleaving [60.95424607008241]
We develop a hierarchical model of supervisory control driven by reinforcement learning.
The model reproduces known empirical effects of task interleaving.
The results support hierarchical RL as a plausible model of task interleaving.
arXiv Detail & Related papers (2020-01-04T17:53:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.