Provable Sim-to-real Transfer in Continuous Domain with Partial
Observations
- URL: http://arxiv.org/abs/2210.15598v1
- Date: Thu, 27 Oct 2022 16:37:52 GMT
- Title: Provable Sim-to-real Transfer in Continuous Domain with Partial
Observations
- Authors: Jiachen Hu, Han Zhong, Chi Jin, Liwei Wang
- Abstract summary: Sim-to-real transfer trains RL agents in the simulated environments and then deploys them in the real world.
We show that a popular robust adversarial training algorithm is capable of learning a policy from the simulated environment that is competitive with the optimal policy in the real-world environment.
- Score: 39.18274543757048
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sim-to-real transfer trains RL agents in the simulated environments and then
deploys them in the real world. Sim-to-real transfer has been widely used in
practice because it is often cheaper, safer and much faster to collect samples
in simulation than in the real world. Despite the empirical success of the
sim-to-real transfer, its theoretical foundation is much less understood. In
this paper, we study sim-to-real transfer in continuous domains with partial
observations, where the simulated environments and real-world environments are
modeled by linear quadratic Gaussian (LQG) systems. We show that a popular
robust adversarial training algorithm is capable of learning a policy from the
simulated environment that is competitive with the optimal policy in the
real-world environment. To achieve our results, we design a new algorithm for
infinite-horizon average-cost LQGs and establish a regret bound that depends on
the intrinsic complexity of the model class. Our algorithm crucially relies on
a novel history clipping scheme, which might be of independent interest.
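The LQG setting and the history-clipping idea can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the scalar system matrices, noise levels, clipping length, and the fixed linear gains are all assumptions chosen for the sketch; the key ingredient shown is that the policy acts only on the H most recent observations rather than the full history.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative scalar LQG system; A, B, C, Q, R and the noise levels are
# assumptions for this sketch, not values from the paper.
A, B, C = 0.9, 1.0, 1.0        # dynamics, control, and observation gains
Q, R = 1.0, 0.1                # state and control cost weights
STD_W, STD_V = 0.1, 0.1        # process / observation noise std

H = 4                          # history-clipping length: the policy only
                               # sees the H most recent observations
K = np.array([-0.05, -0.05, -0.1, -0.3])  # illustrative gains (oldest -> newest)

def average_cost(T=5000):
    """Roll out the LQG system under a linear policy on the clipped history."""
    x, total = 0.0, 0.0
    hist = [0.0] * H
    for _ in range(T):
        y = C * x + STD_V * rng.standard_normal()
        hist = hist[1:] + [y]              # drop the oldest observation
        u = float(K @ np.array(hist))      # control is linear in the clipped history
        total += Q * x * x + R * u * u
        x = A * x + B * u + STD_W * rng.standard_normal()
    return total / T

avg_cost = average_cost()
```

With these (stabilizing) gains the closed loop is stable and the empirical average cost settles near its stationary value, which is the quantity the paper's regret bound is stated against.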
Related papers
- Overcoming the Sim-to-Real Gap: Leveraging Simulation to Learn to Explore for Real-World RL [25.991354823569033]
We show that in many regimes, while direct sim2real transfer may fail, the simulator can be used to learn a set of exploratory policies.
In particular, in the setting of low-rank MDPs, we show that these exploratory policies can be coupled with simple, practical approaches to real-world learning.
This is the first evidence that simulation transfer yields a provable gain in reinforcement learning in settings where direct sim2real transfer fails.
arXiv Detail & Related papers (2024-10-26T19:12:27Z) - LoopSR: Looping Sim-and-Real for Lifelong Policy Adaptation of Legged Robots [20.715834172041763]
We propose a lifelong policy adaptation framework named LoopSR.
It reconstructs the real-world environments back in simulation for further improvement.
By leveraging the continual training, LoopSR achieves superior data efficiency compared with strong baselines.
arXiv Detail & Related papers (2024-09-26T16:02:25Z) - Sim-to-Real Transfer of Deep Reinforcement Learning Agents for Online Coverage Path Planning [15.792914346054502]
We tackle the challenge of sim-to-real transfer of reinforcement learning (RL) agents for coverage path planning (CPP).
We bridge the sim-to-real gap through a semi-virtual environment, including a real robot and real-time aspects, while utilizing a simulated sensor and obstacles.
We find that a high inference frequency allows first-order Markovian policies to transfer directly from simulation, while higher-order policies can be fine-tuned to further reduce the sim-to-real gap.
arXiv Detail & Related papers (2024-06-07T13:24:19Z) - Zero-shot Sim2Real Adaptation Across Environments [45.44896435487879]
We propose a Reverse Action Transformation (RAT) policy which learns to imitate simulated policies in the real-world.
RAT can then be deployed on top of a Universal Policy Network to achieve zero-shot adaptation to new environments.
arXiv Detail & Related papers (2023-02-08T11:59:07Z) - Understanding Domain Randomization for Sim-to-real Transfer [41.33483293243257]
We propose a theoretical framework for sim-to-real transfers, in which the simulator is modeled as a set of MDPs with tunable parameters.
We prove that sim-to-real transfer can succeed under mild conditions without any real-world training samples.
arXiv Detail & Related papers (2021-10-07T07:45:59Z) - Sim and Real: Better Together [47.14469055555684]
We demonstrate how to learn simultaneously from both simulation and interaction with the real environment.
We propose an algorithm for balancing the large number of samples from the high throughput but less accurate simulation.
We analyze such multi-environment interaction theoretically, and provide convergence properties, through a novel theoretical replay buffer analysis.
arXiv Detail & Related papers (2021-10-01T14:30:03Z) - TrafficSim: Learning to Simulate Realistic Multi-Agent Behaviors [74.67698916175614]
We propose TrafficSim, a multi-agent behavior model for realistic traffic simulation.
In particular, we leverage an implicit latent variable model to parameterize a joint actor policy.
We show TrafficSim generates significantly more realistic and diverse traffic scenarios as compared to a diverse set of baselines.
arXiv Detail & Related papers (2021-01-17T00:29:30Z) - Point Cloud Based Reinforcement Learning for Sim-to-Real and Partial
Observability in Visual Navigation [62.22058066456076]
Reinforcement Learning (RL) provides powerful tools to solve complex robotic tasks.
However, policies trained in simulation often do not work directly in the real world, which is known as the sim-to-real transfer problem.
We propose a method that learns on an observation space constructed by point clouds and environment randomization.
arXiv Detail & Related papers (2020-07-27T17:46:59Z) - RL-CycleGAN: Reinforcement Learning Aware Simulation-To-Real [74.45688231140689]
We introduce the RL-scene consistency loss for image translation, which ensures that the translation operation is invariant with respect to the Q-values associated with the image.
We obtain RL-CycleGAN, a new approach for simulation-to-real-world transfer for reinforcement learning.
arXiv Detail & Related papers (2020-06-16T08:58:07Z)
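The RL-scene consistency loss from the last entry can be sketched as an extra penalty that keeps Q-values invariant under the sim-to-real image translation. The linear stand-in Q-function and translator below are assumptions for illustration, not the paper's networks; only the shape of the penalty term follows the summary above.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in "networks" (illustrative assumptions, not RL-CycleGAN's models):
# a linear Q-function over a flattened 16-dim image, and a near-identity
# linear sim-to-real translator.
W_q = 0.1 * rng.standard_normal(16)
W_t = np.eye(16) + 0.05 * rng.standard_normal((16, 16))

def q_value(img):
    """Scalar Q-value of an image under the stand-in Q-function."""
    return W_q @ img

def translate(img):
    """Stand-in sim-to-real image translation."""
    return W_t @ img

def rl_scene_consistency_loss(img):
    """Penalize any change in Q-value caused by the translation."""
    return (q_value(img) - q_value(translate(img))) ** 2

img = rng.standard_normal(16)   # a flattened simulated "image"
loss = rl_scene_consistency_loss(img)
```

Driving this penalty to zero makes the translation operation Q-value-invariant, which is the property the entry describes.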
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.