A Scalable and Parallelizable Digital Twin Framework for Sustainable Sim2Real Transition of Multi-Agent Reinforcement Learning Systems
- URL: http://arxiv.org/abs/2403.10996v1
- Date: Sat, 16 Mar 2024 18:47:04 GMT
- Title: A Scalable and Parallelizable Digital Twin Framework for Sustainable Sim2Real Transition of Multi-Agent Reinforcement Learning Systems
- Authors: Chinmay Vilas Samak, Tanmay Vilas Samak, Venkat Krovi
- Abstract summary: This work presents a sustainable multi-agent deep reinforcement learning framework capable of selectively scaling parallelized training workloads on-demand.
We introduce AutoDRIVE Ecosystem as an enabling digital twin framework to train, deploy, and transfer cooperative as well as competitive multi-agent reinforcement learning policies from simulation to reality.
- Score: 1.0582505915332336
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work presents a sustainable multi-agent deep reinforcement learning framework capable of selectively scaling parallelized training workloads on-demand, and transferring the trained policies from simulation to reality using minimal hardware resources. We introduce AutoDRIVE Ecosystem as an enabling digital twin framework to train, deploy, and transfer cooperative as well as competitive multi-agent reinforcement learning policies from simulation to reality. Particularly, we first investigate an intersection traversal problem of 4 cooperative vehicles (Nigel) that share limited state information in single as well as multi-agent learning settings using a common policy approach. We then investigate an adversarial autonomous racing problem of 2 vehicles (F1TENTH) using an individual policy approach. In either set of experiments, a decentralized learning architecture was adopted, which allowed robust training and testing of the policies in stochastic environments. The agents were provided with realistically sparse observation spaces, and were restricted to sample control actions that implicitly satisfied the imposed kinodynamic and safety constraints. The experimental results for both problem statements are reported in terms of quantitative metrics and qualitative remarks for training as well as deployment phases. We also discuss agent and environment parallelization techniques adopted to efficiently accelerate MARL training, while analyzing their computational performance. Finally, we demonstrate a resource-aware transition of the trained policies from simulation to reality using the proposed digital twin framework.
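The environment parallelization mentioned in the abstract can be illustrated with a minimal sketch. This is not the AutoDRIVE implementation: `ToyEnv`, `shared_policy`, and `collect_parallel` are hypothetical names, the 1-D dynamics stand in for a real simulator, and a thread pool is assumed purely to show the structure (independent environment copies, one common policy, clipped actions).

```python
import random
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


class ToyEnv:
    """Hypothetical 1-D stand-in for one simulated environment instance."""

    def __init__(self, seed: int, goal: float = 5.0, max_steps: int = 50):
        self.rng = random.Random(seed)  # per-environment RNG keeps runs reproducible
        self.goal = goal
        self.max_steps = max_steps

    def rollout(self, policy: Callable[[float], float]) -> Tuple[float, int]:
        """Run one episode; return (total reward, number of steps taken)."""
        pos = 0.0
        total_reward = 0.0
        for step in range(1, self.max_steps + 1):
            action = policy(self.goal - pos)
            action = max(-1.0, min(1.0, action))  # clip to the allowed action range
            pos += action + self.rng.uniform(-0.1, 0.1)  # stochastic transition
            total_reward -= abs(self.goal - pos)  # penalize distance to the goal
            if abs(self.goal - pos) < 0.2:
                return total_reward, step
        return total_reward, self.max_steps


def shared_policy(error: float) -> float:
    """Common policy used by every environment copy (assumed proportional rule)."""
    return 0.5 * error


def collect_parallel(
    num_envs: int, policy: Callable[[float], float], workers: int = 4
) -> List[Tuple[float, int]]:
    """Roll out the same policy in num_envs independent environment copies."""
    envs = [ToyEnv(seed=i) for i in range(num_envs)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so results[i] belongs to envs[i]
        return list(pool.map(lambda env: env.rollout(policy), envs))


if __name__ == "__main__":
    for reward, steps in collect_parallel(8, shared_policy):
        print(f"reward={reward:.2f} steps={steps}")
```

Threads only show the layout here. For CPU-bound simulation in Python, separate processes or separate simulator instances would be the more realistic choice.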
Related papers
- Dual Policy Reinforcement Learning for Real-time Rebalancing in Bike-sharing Systems [13.083156894368532]
Bike-sharing systems play a crucial role in easing traffic congestion and promoting healthier lifestyles.
This study introduces a novel approach to address the real-time rebalancing problem with a fleet of vehicles.
It employs a dual policy reinforcement learning algorithm that decouples inventory and routing decisions.
arXiv Detail & Related papers (2024-06-02T21:05:23Z)
- SAFE-SIM: Safety-Critical Closed-Loop Traffic Simulation with Controllable Adversaries [94.84458417662407]
We introduce SAFE-SIM, a novel diffusion-based controllable closed-loop safety-critical simulation framework.
We develop a novel approach to simulate safety-critical scenarios through an adversarial term in the denoising process.
We validate our framework empirically using the NuScenes dataset, demonstrating improvements in both realism and controllability.
arXiv Detail & Related papers (2023-12-31T04:14:43Z) - Conformal Policy Learning for Sensorimotor Control Under Distribution
Shifts [61.929388479847525]
This paper focuses on the problem of detecting and reacting to changes in the distribution of a sensorimotor controller's observables.
The key idea is the design of switching policies that can take conformal quantiles as input.
We show how to design such policies by using conformal quantiles to switch between base policies with different characteristics.
arXiv Detail & Related papers (2023-11-02T17:59:30Z) - Multi-Agent Deep Reinforcement Learning for Cooperative and Competitive
Autonomous Vehicles using AutoDRIVE Ecosystem [1.1893676124374688]
We introduce AutoDRIVE Ecosystem as an enabler to develop physically accurate and graphically realistic digital twins of Nigel and F1TENTH.
We first investigate an intersection problem using a set of cooperative vehicles (Nigel) that share limited state information with each other in single as well as multi-agent learning settings.
We then investigate an adversarial head-to-head autonomous racing problem using a different set of vehicles (F1TENTH) in a multi-agent learning setting using an individual policy approach.
arXiv Detail & Related papers (2023-09-18T02:43:59Z) - Marginalized Importance Sampling for Off-Environment Policy Evaluation [13.824507564510503]
Reinforcement Learning (RL) methods are typically sample-inefficient, making it challenging to train and deploy RL-policies in real world robots.
This paper proposes a new approach to evaluate the real-world performance of agent policies prior to deploying them in the real world.
Our approach incorporates a simulator along with real-world offline data to evaluate the performance of any policy.
arXiv Detail & Related papers (2023-09-04T20:52:04Z) - Prompt to Transfer: Sim-to-Real Transfer for Traffic Signal Control with
Prompt Learning [4.195122359359966]
Large Language Models (LLMs) are trained on massive amounts of knowledge and have proved to possess astonishing inference abilities.
In this work, we leverage LLMs to understand and profile the system dynamics by a prompt-based grounded action transformation.
arXiv Detail & Related papers (2023-08-28T03:49:13Z) - Safe Multi-agent Learning via Trapping Regions [89.24858306636816]
We apply the concept of trapping regions, known from qualitative theory of dynamical systems, to create safety sets in the joint strategy space for decentralized learning.
We propose a binary partitioning algorithm for verification that candidate sets form trapping regions in systems with known learning dynamics, and a sampling algorithm for scenarios where learning dynamics are not known.
arXiv Detail & Related papers (2023-02-27T14:47:52Z) - NeurIPS 2022 Competition: Driving SMARTS [60.948652154552136]
Driving SMARTS is a regular competition designed to tackle problems caused by the distribution shift in dynamic interaction contexts.
The proposed competition supports methodologically diverse solutions, such as reinforcement learning (RL) and offline learning methods.
arXiv Detail & Related papers (2022-11-14T17:10:53Z) - Learning Interactive Driving Policies via Data-driven Simulation [125.97811179463542]
Data-driven simulators promise high data-efficiency for driving policy learning.
Small underlying datasets often lack interesting and challenging edge cases for learning interactive driving.
We propose a simulation method that uses in-painted ado vehicles for learning robust driving policies.
arXiv Detail & Related papers (2021-11-23T20:14:02Z) - TrafficSim: Learning to Simulate Realistic Multi-Agent Behaviors [74.67698916175614]
We propose TrafficSim, a multi-agent behavior model for realistic traffic simulation.
In particular, we leverage an implicit latent variable model to parameterize a joint actor policy.
We show TrafficSim generates significantly more realistic and diverse traffic scenarios as compared to a diverse set of baselines.
arXiv Detail & Related papers (2021-01-17T00:29:30Z) - CARLA Real Traffic Scenarios -- novel training ground and benchmark for
autonomous driving [8.287331387095545]
This work introduces interactive traffic scenarios in the CARLA simulator, which are based on real-world traffic.
We concentrate on tactical tasks lasting several seconds, which are especially challenging for current control methods.
The CARLA Real Traffic Scenarios (CRTS) is intended to be a training and testing ground for autonomous driving systems.
arXiv Detail & Related papers (2020-12-16T13:20:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.