Multi-Agent Reinforcement Learning with Long-Term Performance Objectives for Service Workforce Optimization
- URL: http://arxiv.org/abs/2503.01069v1
- Date: Mon, 03 Mar 2025 00:16:47 GMT
- Title: Multi-Agent Reinforcement Learning with Long-Term Performance Objectives for Service Workforce Optimization
- Authors: Kareem Eissa, Rayal Prasad, Sarith Mohan, Ankur Kapoor, Dorin Comaniciu, Vivek Singh
- Abstract summary: Our aim is to create a simulator that models a unified workforce optimization problem. Specifically, we designed a modular simulator to support the development of reinforcement learning methods. The simulator provides parameterizations to help explore dynamic scenarios with varying levels of stochasticity and non-stationarity.
- Score: 2.865067924658368
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Workforce optimization plays a crucial role in efficient organizational operations where decision-making may span several different administrative and time scales. For instance, dispatching personnel to immediate service requests while managing talent acquisition with various expertise sets up a highly dynamic optimization problem. Existing work focuses on specific sub-problems such as resource allocation and facility location, which are solved with heuristics like local-search and, more recently, deep reinforcement learning. However, these may not accurately represent real-world scenarios where such sub-problems are not fully independent. Our aim is to fill this gap by creating a simulator that models a unified workforce optimization problem. Specifically, we designed a modular simulator to support the development of reinforcement learning methods for integrated workforce optimization problems. We focus on three interdependent aspects: personnel dispatch, workforce management, and personnel positioning. The simulator provides configurable parameterizations to help explore dynamic scenarios with varying levels of stochasticity and non-stationarity. To facilitate benchmarking and ablation studies, we also include heuristic and RL baselines for the above-mentioned aspects.
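The unified problem described in the abstract couples three decisions: dispatching staff to requests, hiring (workforce management), and positioning idle staff. A minimal sketch of such an environment is given below; all names, dynamics, and reward terms are hypothetical simplifications for illustration and are not the paper's actual simulator, whose modular design is far richer. The `arrival_rate` and `drift` parameters stand in for the configurable stochasticity and non-stationarity knobs.

```python
import random

class WorkforceEnv:
    """Toy unified workforce environment (illustrative, not the paper's code).

    State: per-site request backlog and idle-staff counts.
    Action: (dispatch_site, hire_flag, hire_site) covering the three
    interdependent aspects: dispatch, workforce management, positioning.
    """

    def __init__(self, n_sites=3, arrival_rate=0.5, drift=0.0, seed=0):
        self.n_sites = n_sites
        self.arrival_rate = arrival_rate   # stochasticity knob
        self.drift = drift                 # non-stationarity knob
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        self.queue = [0] * self.n_sites    # pending requests per site
        self.staff = [1] * self.n_sites    # idle personnel per site
        self.t = 0
        return self._obs()

    def _obs(self):
        return tuple(self.queue + self.staff)

    def step(self, action):
        dispatch_site, hire, hire_site = action
        reward = 0.0
        # Personnel dispatch: serve a request where staff is available.
        if self.queue[dispatch_site] > 0 and self.staff[dispatch_site] > 0:
            self.queue[dispatch_site] -= 1
            reward += 1.0
        # Workforce management / positioning: hiring adds capacity at a cost.
        if hire:
            self.staff[hire_site] += 1
            reward -= 0.3
        # Non-stationary demand: the arrival rate drifts over time.
        rate = min(1.0, self.arrival_rate + self.drift * self.t)
        for s in range(self.n_sites):
            if self.rng.random() < rate:
                self.queue[s] += 1
        reward -= 0.1 * sum(self.queue)    # holding cost on the backlog
        self.t += 1
        done = self.t >= 100
        return self._obs(), reward, done

# Usage: a greedy heuristic baseline that dispatches to the longest queue.
env = WorkforceEnv(seed=42)
obs = env.reset()
total, done = 0.0, False
while not done:
    site = max(range(env.n_sites), key=lambda s: env.queue[s])
    obs, r, done = env.step((site, False, 0))
    total += r
```

An RL agent would replace the greedy dispatch rule with a learned policy over the joint action, which is where the interdependence between dispatch, hiring, and positioning matters.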
Related papers
- Large Language Model as Meta-Surrogate for Data-Driven Many-Task Optimization: A Proof-of-Principle Study [11.452011929848844]
This study proposes a novel meta-surrogate framework to assist many-task optimization.
We formulate a unified framework for many-task fitness prediction, by defining a universal model with metadata to fit a group of problems.
Our framework supports dual-level knowledge transfer -- at both the surrogate and individual levels -- enhancing optimization efficiency and robustness.
arXiv Detail & Related papers (2025-03-11T11:13:11Z) - Improving Retrospective Language Agents via Joint Policy Gradient Optimization [57.35348425288859]
RetroAct is a framework that jointly optimizes both task-planning and self-reflective evolution capabilities in language agents.
We develop a two-stage joint optimization process that integrates imitation learning and reinforcement learning.
We conduct extensive experiments across various testing environments, demonstrating that RetroAct achieves substantial improvements in task performance and decision-making processes.
arXiv Detail & Related papers (2025-03-03T12:54:54Z) - A Multi-AI Agent System for Autonomous Optimization of Agentic AI Solutions via Iterative Refinement and LLM-Driven Feedback Loops [3.729242965449096]
This paper introduces a framework for autonomously optimizing Agentic AI solutions across industries. The framework achieves optimal performance without human input by autonomously generating and testing hypotheses. Case studies show significant improvements in output quality, relevance, and actionability.
arXiv Detail & Related papers (2024-12-22T20:08:04Z) - Characterization of Large Language Model Development in the Datacenter [55.9909258342639]
Large Language Models (LLMs) have presented impressive performance across several transformative tasks.
However, it is non-trivial to efficiently utilize large-scale cluster resources to develop LLMs.
We present an in-depth characterization study of a six-month LLM development workload trace collected from our GPU datacenter Acme.
arXiv Detail & Related papers (2024-03-12T13:31:14Z) - Orchestration of Emulator Assisted Mobile Edge Tuning for AI Foundation Models: A Multi-Agent Deep Reinforcement Learning Approach [10.47302625959368]
We present a groundbreaking paradigm integrating Mobile Edge Computing with foundation models, specifically designed to enhance local task performance on user equipment (UE).
Central to our approach is the innovative Emulator-Adapter architecture, segmenting the foundation model into two cohesive modules.
We introduce an advanced resource allocation mechanism that is fine-tuned to the needs of the Emulator-Adapter structure in decentralized settings.
arXiv Detail & Related papers (2023-10-26T15:47:51Z) - Reinforcement Learning Approach for Multi-Agent Flexible Scheduling Problems [0.0]
This research presents a Reinforcement Learning approach for scheduling problems.
In particular, this study delivers an OpenAI gym environment with search-space reduction for Job Shop Scheduling Problems.
arXiv Detail & Related papers (2022-10-07T16:31:01Z) - Controllable Dynamic Multi-Task Architectures [92.74372912009127]
We propose a controllable multi-task network that dynamically adjusts its architecture and weights to match the desired task preference as well as the resource constraints.
We propose a disentangled training of two hypernetworks, by exploiting task affinity and a novel branching regularized loss, to take input preferences and accordingly predict tree-structured models with adapted weights.
arXiv Detail & Related papers (2022-03-28T17:56:40Z) - Reinforcement Learning for Location-Aware Scheduling [1.0660480034605238]
We show how various aspects of the warehouse environment affect performance and execution priority.
We propose a compact representation of the state and action space for location-aware multi-agent systems.
We also show how agents trained in certain environments maintain performance in completely unseen settings.
arXiv Detail & Related papers (2022-03-07T15:51:00Z) - Energy-Efficient Multi-Orchestrator Mobile Edge Learning [54.28419430315478]
Mobile Edge Learning (MEL) is a collaborative learning paradigm that features distributed training of Machine Learning (ML) models over edge devices.
In MEL, possible coexistence of multiple learning tasks with different datasets may arise.
We propose lightweight algorithms that can achieve near-optimal performance and facilitate the trade-offs between energy consumption, accuracy, and solution complexity.
arXiv Detail & Related papers (2021-09-02T07:37:10Z) - Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC.
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z) - Auxiliary-task Based Deep Reinforcement Learning for Participant Selection Problem in Mobile Crowdsourcing [30.124365580284888]
In mobile crowdsourcing, the platform selects participants to complete location-aware tasks from the recruiters aiming to achieve multiple goals.
Different MCS systems have different goals and there are possibly conflicting goals even in one MCS system.
It is crucial to design a participant selection algorithm that applies to different MCS systems to achieve multiple goals.
arXiv Detail & Related papers (2020-08-25T15:02:54Z) - Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments.
We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data.
Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
arXiv Detail & Related papers (2020-02-20T15:00:54Z)
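The Dynamic Federated Learning entry above describes a model where, at every iteration, a random subset of agents performs local updates that the server then aggregates. A toy sketch of that partial-participation loop follows; the quadratic local losses, parameter values, and function names are illustrative assumptions, not the paper's model.

```python
import random

def federated_round(w, minimizers, rng, subset_size=3, lr=0.1):
    """One iteration: sample a random subset of agents, let each take a
    local gradient step on its own loss f_k(w) = 0.5 * (w - m_k)**2
    starting from the shared model w, then average on the server."""
    agents = rng.sample(range(len(minimizers)), subset_size)
    updates = []
    for k in agents:
        grad = w - minimizers[k]          # gradient of agent k's local loss
        updates.append(w - lr * grad)     # local update from shared model
    return sum(updates) / len(updates)    # server-side aggregation

# Data variability across agents: each agent has a different local minimizer.
rng = random.Random(0)
minimizers = [1.0, 2.0, 3.0, 4.0, 5.0]
w = 0.0
for _ in range(500):
    w = federated_round(w, minimizers, rng)
# w ends up tracking a neighborhood of the aggregate minimizer (here, the
# mean of the local minima, 3.0); the random-subset sampling keeps it
# fluctuating around that point rather than converging exactly.
```

The spread of `minimizers` plays the role of the data variability across agents, and shrinking `lr` narrows the fluctuation around the aggregate minimizer, loosely mirroring the tracking term in the abstract that is inversely proportional to the learning rate.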
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.