Scalable Semantic Non-Markovian Simulation Proxy for Reinforcement
Learning
- URL: http://arxiv.org/abs/2310.06835v2
- Date: Sun, 15 Oct 2023 01:14:23 GMT
- Title: Scalable Semantic Non-Markovian Simulation Proxy for Reinforcement
Learning
- Authors: Kaustuv Mukherji, Devendra Parkar, Lahari Pokala, Dyuman Aditya, Paulo
Shakarian, Clark Dorman
- Abstract summary: We propose a semantic proxy for simulation based on a temporal extension to annotated logic.
We show up to three orders of magnitude speed-up while preserving the quality of policy learned.
- Score: 0.125828876338076
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in reinforcement learning (RL) have shown much promise across
a variety of applications. However, issues such as scalability, explainability,
and Markovian assumptions limit its applicability in certain domains. We
observe that many of these shortcomings emanate from the simulator as opposed
to the RL training algorithms themselves. As such, we propose a semantic proxy
for simulation based on a temporal extension to annotated logic. In comparison
with two high-fidelity simulators, we show up to three orders of magnitude
speed-up while preserving the quality of policy learned. In addition, we show
the ability to model and leverage non-Markovian dynamics and instantaneous
actions while providing an explainable trace describing the outcomes of the
agent actions.
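To make the idea concrete, here is a minimal sketch of a rule-based simulation proxy with a gym-style step() interface. The Rule/RuleProxyEnv classes and the rule syntax are illustrative assumptions, not the authors' implementation, which builds on a temporal extension of annotated logic:

```python
# Illustrative sketch (not the authors' implementation): a rule-based
# simulation proxy with a gym-style interface. Rules fire on the *history*
# of facts rather than only the current state, which is how non-Markovian
# dynamics enter; annotations are confidence bounds in the spirit of
# annotated logic, and the returned trace makes each outcome explainable.

class Rule:
    def __init__(self, head, condition, delay, annotation):
        self.head = head              # fact to derive, e.g. "goal"
        self.condition = condition    # predicate over (past, now, action)
        self.delay = delay            # how many steps back the body looks
        self.annotation = annotation  # confidence interval [lo, hi]

class RuleProxyEnv:
    """step() forward-chains rules instead of integrating a simulator."""
    def __init__(self, rules, init_facts):
        self.rules = rules
        self.history = [set(init_facts)]   # full trace kept for rule bodies

    def step(self, action):
        derived = set(self.history[-1])
        trace = []                         # explainable record of firings
        for r in self.rules:
            past = (self.history[-1 - r.delay]
                    if len(self.history) > r.delay else set())
            if r.condition(past, self.history[-1], action):
                derived.add(r.head)
                trace.append((r.head, r.annotation))
        self.history.append(derived)
        done = "goal" in derived
        return derived, float(done), done, {"trace": trace}

# Example: "open" only reaches the goal if the key was picked up at least
# two steps earlier -- a non-Markovian condition no flat state can express.
rules = [
    Rule("has_key", lambda past, now, a: a == "pickup",
         delay=0, annotation=(1.0, 1.0)),
    Rule("goal", lambda past, now, a: a == "open" and "has_key" in past,
         delay=2, annotation=(0.9, 1.0)),
]
env = RuleProxyEnv(rules, init_facts=["start"])
for a in ["pickup", "wait", "wait", "open"]:
    state, reward, done, info = env.step(a)
    print(a, sorted(state), reward, info["trace"])
```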
Related papers
- Autonomous Vehicle Controllers From End-to-End Differentiable Simulation [60.05963742334746]
We propose a differentiable simulator and design an analytic policy gradients (APG) approach to training AV controllers.
Our proposed framework brings the differentiable simulator into an end-to-end training loop, where gradients of environment dynamics serve as a useful prior to help the agent learn a more grounded policy.
We find significant improvements in performance and robustness to noise in the dynamics, as well as overall more intuitive human-like handling.
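As a hedged illustration of the APG idea, the toy sketch below backpropagates a return through scalar linear dynamics x' = a*x + b*u with policy u = -k*x, hand-rolling the reverse pass; the paper differentiates through a full AV simulator, and all names and constants here are illustrative:

```python
# Toy sketch of analytic policy gradients (APG): because the dynamics are
# differentiable, the return can be backpropagated through the rollout
# itself, giving exact gradients of cost with respect to policy parameters.

def rollout(k, x0, a=0.9, b=0.5, c=0.1, T=20):
    xs, us, cost = [x0], [], 0.0
    for _ in range(T):
        u = -k * xs[-1]
        cost += xs[-1] ** 2 + c * u ** 2
        us.append(u)
        xs.append(a * xs[-1] + b * u)   # differentiable "simulator" step
    return xs, us, cost

def analytic_grad(k, x0, a=0.9, b=0.5, c=0.1, T=20):
    xs, us, _ = rollout(k, x0, a, b, c, T)
    grad, lam = 0.0, 0.0                # lam = dJ/dx_{t+1}, the adjoint
    for t in reversed(range(T)):
        dJ_du = 2 * c * us[t] + lam * b          # cost + downstream effect
        grad += dJ_du * (-xs[t])                 # du_t/dk = -x_t
        lam = 2 * xs[t] + lam * a + dJ_du * (-k) # dJ/dx_t
    return grad

# Gradient descent on the controller gain, using the simulator's gradients.
k = 0.0
for _ in range(200):
    k -= 0.01 * analytic_grad(k, x0=1.0)
print("learned gain:", round(k, 3), "cost:", round(rollout(k, 1.0)[2], 3))
```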
arXiv Detail & Related papers (2024-09-12T11:50:06Z)
- SAFE-SIM: Safety-Critical Closed-Loop Traffic Simulation with Diffusion-Controllable Adversaries [94.84458417662407]
We introduce SAFE-SIM, a controllable closed-loop safety-critical simulation framework.
Our approach yields two distinct advantages: 1) generating realistic long-tail safety-critical scenarios that closely reflect real-world conditions, and 2) providing controllable adversarial behavior for more comprehensive and interactive evaluations.
We validate our framework empirically using the nuScenes and nuPlan datasets across multiple planners, demonstrating improvements in both realism and controllability.
arXiv Detail & Related papers (2023-12-31T04:14:43Z)
- Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning [68.16998247593209]
The offline reinforcement learning (RL) paradigm provides a recipe to convert static behavior datasets into policies that can outperform the policy that collected the data.
In this paper, we propose an adaptive scheme for action quantization.
We show that several state-of-the-art offline RL methods such as IQL, CQL, and BRAC improve in performance on benchmarks when combined with our proposed discretization scheme.
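The summary leaves the quantizer unspecified; as a simple stand-in for the paper's adaptive scheme, the sketch below builds a k-means codebook over the dataset's continuous actions so that any discrete-action offline RL method can run on top (the function names are hypothetical):

```python
import numpy as np

# Hedged sketch: quantize a continuous-action offline dataset with a
# k-means codebook. The paper's scheme adapts the quantization to the
# data; plain k-means is the simplest data-adaptive stand-in, not the
# authors' exact method.

def fit_codebook(actions, n_codes=16, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    codebook = actions[rng.choice(len(actions), n_codes, replace=False)]
    for _ in range(iters):
        # assign each action to its nearest code
        d = np.linalg.norm(actions[:, None, :] - codebook[None], axis=-1)
        assign = d.argmin(axis=1)
        # move each code to the mean of its assigned actions
        for c in range(n_codes):
            members = actions[assign == c]
            if len(members):
                codebook[c] = members.mean(axis=0)
    return codebook

def quantize(actions, codebook):
    d = np.linalg.norm(actions[:, None, :] - codebook[None], axis=-1)
    return d.argmin(axis=1)              # discrete action indices

# Usage: replace continuous actions with code indices, train any
# discrete-action method (IQL, CQL, ...), then execute the decoded
# codebook entry at deployment time.
acts = np.random.randn(10_000, 2)
cb = fit_codebook(acts, n_codes=8)
idx = quantize(acts[:5], cb)
print(idx, cb[idx])                      # indices and decoded actions
```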
arXiv Detail & Related papers (2023-10-18T06:07:10Z)
- Anchored Learning for On-the-Fly Adaptation -- Extended Technical Report [45.123633153460034]
This study presents "anchor critics", a novel strategy for enhancing the robustness of reinforcement learning (RL) agents in crossing the sim-to-real gap.
We identify that naive fine-tuning approaches lead to catastrophic forgetting, where policies maintain high rewards on frequently encountered states but lose performance on rarer, yet critical scenarios.
Evaluations demonstrate that our approach enables behavior retention in sim-to-sim gymnasium tasks and in sim-to-real scenarios with racing quadrotors, achieving a near-50% reduction in power consumption while maintaining controllable, stable flight.
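The summary does not spell out the mechanism; one plausible reading, sketched below with hypothetical names and a tabular critic for brevity, is to fine-tune on target-domain batches while penalizing drift from a frozen pre-trained critic on replayed simulation states:

```python
import numpy as np

# Hypothetical sketch (not the paper's published loss): fine-tune the
# critic on real-world batches while an "anchor" term keeps its values
# close to a frozen, pre-trained sim critic on replayed simulation
# states, so rarely revisited behaviors are not forgotten.

rng = np.random.default_rng(0)
n_states, n_actions, gamma = 10, 4, 0.99
q_anchor = rng.normal(size=(n_states, n_actions))  # frozen sim critic
q = q_anchor.copy()                                # critic being fine-tuned

def fine_tune_step(q, real_batch, sim_states, beta=1.0, lr=0.1):
    s, a, r, s2 = real_batch
    grad = np.zeros_like(q)
    for si, ai, ri, s2i in zip(s, a, r, s2):       # TD error on real data
        grad[si, ai] += 2 * (q[si, ai] - (ri + gamma * q[s2i].max()))
    # anchor term: stay close to the frozen critic on simulation states
    np.add.at(grad, sim_states,
              2 * beta * (q[sim_states] - q_anchor[sim_states]))
    return q - lr * grad / len(s)

real_batch = (rng.integers(0, n_states, 32), rng.integers(0, n_actions, 32),
              rng.normal(size=32), rng.integers(0, n_states, 32))
sim_states = rng.integers(0, n_states, 32)
q = fine_tune_step(q, real_batch, sim_states)
print("max drift from anchor:", np.abs(q - q_anchor).max())
```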
arXiv Detail & Related papers (2023-01-17T16:16:53Z)
- Guaranteed Conservation of Momentum for Learning Particle-based Fluid Dynamics [96.9177297872723]
We present a novel method for guaranteeing conservation of linear momentum in learned physics simulations.
We enforce conservation of momentum with a hard constraint, which we realize via antisymmetrical continuous convolutional layers.
In combination, the proposed method allows us to increase the physical accuracy of the learned simulator substantially.
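The mechanism is easy to demonstrate: a pairwise interaction that is an odd function of the displacement satisfies f_ij = -f_ji, so internal forces cancel in pairs and total linear momentum cannot change. The numpy sketch below uses a hand-rolled odd function in place of the paper's antisymmetrical continuous convolutions:

```python
import numpy as np

# Minimal illustration of the mechanism (not the paper's network): if the
# learned pairwise interaction is odd in the displacement,
# f(x_i - x_j) = -f(x_j - x_i), every internal force has an equal and
# opposite partner and momentum is conserved by construction.

rng = np.random.default_rng(1)
W1, W2 = rng.normal(size=(3, 16)), rng.normal(size=(16, 3))

def g(d):                        # arbitrary learned map, d: (..., 3)
    return np.tanh(d @ W1) @ W2

def pair_force(d):               # antisymmetrization: odd in d by design
    return g(d) - g(-d)

pos = rng.normal(size=(8, 3))    # 8 particles in 3-D, unit masses
vel = rng.normal(size=(8, 3))
disp = pos[:, None, :] - pos[None, :, :]    # x_i - x_j, shape (8, 8, 3)
forces = pair_force(disp).sum(axis=1)       # net force on each particle

vel_new = vel + 0.01 * forces               # one explicit Euler step
# total momentum is unchanged up to floating-point error:
print(np.abs(vel.sum(0) - vel_new.sum(0)).max())   # ~1e-15
```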
arXiv Detail & Related papers (2022-10-12T09:12:59Z)
- Backpropagation through Time and Space: Learning Numerical Methods with Multi-Agent Reinforcement Learning [6.598324641949299]
We treat the numerical schemes underlying partial differential equations as a Partially Observable Markov Game (POMG) in Reinforcement Learning (RL).
Similar to numerical solvers, our agent acts at each discrete location of a computational space for efficient, generalizable learning.
To learn higher-order spatial methods by acting on local states, the agent must discern how its actions at a given temporal location affect the future evolution of the state.
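As an illustrative toy version of this setup (not the paper's code), the sketch below places one agent per grid cell of a 1-D advection equation: each agent observes only its local neighborhood and picks a blend between upwind and central stencils, and the joint action is the global update; training of the shared policy is omitted:

```python
import numpy as np

# Toy version of the POMG view of a numerical scheme: each grid cell
# hosts an agent with a partial observation (its neighborhood); the
# "environment" step is just the discretized PDE applied to the joint
# action. Here: u_t + c u_x = 0 on a periodic domain.

nx, c, dx, dt = 64, 1.0, 1.0 / 64, 0.5 / 64
x = np.arange(nx) * dx
u = np.exp(-200 * (x - 0.3) ** 2)          # initial pulse

def local_obs(u):
    # partial observation: each agent sees (left, self, right)
    return np.stack([np.roll(u, 1), u, np.roll(u, -1)], axis=-1)

def agent_policy(obs, theta):
    # per-cell blend weight in [0, 1] between upwind and central stencils
    return 1.0 / (1.0 + np.exp(-(obs @ theta)))

def step(u, alpha):
    upwind = (u - np.roll(u, 1)) / dx
    central = (np.roll(u, -1) - np.roll(u, 1)) / (2 * dx)
    return u - c * dt * (alpha * upwind + (1 - alpha) * central)

theta = np.zeros(3)                        # shared (untrained) parameters
for _ in range(200):
    u = step(u, agent_policy(local_obs(u), theta))

# compare against the exact advected pulse on the periodic domain
d = (x - 0.3 - c * 200 * dt + 0.5) % 1.0 - 0.5
exact = np.exp(-200 * d ** 2)
print("L2 error vs exact solution:", np.linalg.norm(u - exact) * np.sqrt(dx))
```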
arXiv Detail & Related papers (2022-03-16T20:50:24Z)
- Efficient Model-based Multi-agent Reinforcement Learning via Optimistic Equilibrium Computation [93.52573037053449]
H-MARL (Hallucinated Multi-Agent Reinforcement Learning) learns successful equilibrium policies after a few interactions with the environment.
We demonstrate our approach experimentally on an autonomous driving simulation benchmark.
arXiv Detail & Related papers (2022-03-14T17:24:03Z)
- TrafficSim: Learning to Simulate Realistic Multi-Agent Behaviors [74.67698916175614]
We propose TrafficSim, a multi-agent behavior model for realistic traffic simulation.
In particular, we leverage an implicit latent variable model to parameterize a joint actor policy.
We show TrafficSim generates significantly more realistic and diverse traffic scenarios as compared to a diverse set of baselines.
arXiv Detail & Related papers (2021-01-17T00:29:30Z)
- Large-Scale Multi-Agent Deep FBSDEs [28.525065041507982]
We present a framework for finding Markovian Nash Equilibria in multi-agent games using fictitious play.
We showcase superior performance of our framework over the state-of-the-art deep fictitious play algorithm.
We also demonstrate the applicability of our framework in robotics on a belief space autonomous racing problem.
arXiv Detail & Related papers (2020-11-21T23:00:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.