Magnetic Field-Based Reward Shaping for Goal-Conditioned Reinforcement
Learning
- URL: http://arxiv.org/abs/2307.08033v1
- Date: Sun, 16 Jul 2023 13:04:40 GMT
- Title: Magnetic Field-Based Reward Shaping for Goal-Conditioned Reinforcement
Learning
- Authors: Hongyu Ding, Yuanze Tang, Qing Wu, Bo Wang, Chunlin Chen, Zhi Wang
- Abstract summary: Reward shaping is a practical approach to improving sample efficiency by embedding human domain knowledge into the learning process.
This paper proposes a novel magnetic field-based reward shaping (MFRS) method for goal-conditioned RL tasks with dynamic target and obstacles.
Experimental results in both simulated and real-world robotic manipulation tasks demonstrate that MFRS outperforms relevant existing methods.
- Score: 16.224372286510558
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Goal-conditioned reinforcement learning (RL) is an interesting extension of
the traditional RL framework, where the dynamic environment and reward sparsity
can cause conventional learning algorithms to fail. Reward shaping is a
practical approach to improving sample efficiency by embedding human domain
knowledge into the learning process. Existing reward shaping methods for
goal-conditioned RL are typically built on distance metrics with a linear and
isotropic distribution, which may fail to provide sufficient information about
the ever-changing environment with high complexity. This paper proposes a novel
magnetic field-based reward shaping (MFRS) method for goal-conditioned RL tasks
with dynamic target and obstacles. Inspired by the physical properties of
magnets, we consider the target and obstacles as permanent magnets and
establish the reward function according to the intensity values of the magnetic
field generated by these magnets. The nonlinear and anisotropic distribution of
the magnetic field intensity can provide more accessible and conducive
information about the optimization landscape, thus introducing a more
sophisticated magnetic reward compared to the distance-based setting. Further,
we transform our magnetic reward to the form of potential-based reward shaping
by learning a secondary potential function concurrently to ensure the optimal
policy invariance of our method. Experimental results in both simulated and
real-world robotic manipulation tasks demonstrate that MFRS outperforms
relevant existing methods and effectively improves the sample efficiency of RL
algorithms in goal-conditioned tasks with various dynamics of the target and
obstacles.
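The abstract's core construction, treating the target and obstacles as permanent magnets and rewarding the agent according to the resulting field intensity, can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the point-dipole field model, the magnet parameters, and the attractive/repulsive combination in `magnetic_reward` (including `w_obs`) are all assumptions made for clarity.

```python
import numpy as np

MU0_OVER_4PI = 1e-7  # mu_0 / (4*pi) in SI units

def dipole_intensity(pos, magnet_pos, moment):
    """Field magnitude of a point dipole at `pos`:
    B = mu0/(4*pi*r^3) * (3 (m . r_hat) r_hat - m)."""
    r_vec = pos - magnet_pos
    r = np.linalg.norm(r_vec) + 1e-8  # guard against r = 0
    r_hat = r_vec / r
    b = MU0_OVER_4PI / r**3 * (3.0 * np.dot(moment, r_hat) * r_hat - moment)
    return np.linalg.norm(b)

def magnetic_reward(agent_pos, target_pos, obstacles,
                    m_target, m_obstacle, w_obs=1.0):
    """Illustrative magnetic reward: the target's field attracts, each
    obstacle's field repels (signs and weights are assumptions)."""
    r = dipole_intensity(agent_pos, target_pos, m_target)
    for obs_pos in obstacles:
        r -= w_obs * dipole_intensity(agent_pos, obs_pos, m_obstacle)
    return r
```

Because the field intensity falls off nonlinearly with distance and varies with orientation relative to the dipole moment, such a reward is nonlinear and anisotropic, in contrast to a plain Euclidean-distance reward.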
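The abstract further states that the magnetic reward is cast as potential-based reward shaping via a concurrently learned potential function, which preserves the optimal policy per Ng et al. (1999): F(s, s') = gamma * Phi(s') - Phi(s). The sketch below assumes Phi is a small network regressed toward the magnetic reward; the paper only says a secondary potential is learned concurrently, so this training target is an assumption.

```python
import torch
import torch.nn as nn

class Potential(nn.Module):
    """Hypothetical secondary potential Phi(s) (architecture assumed)."""
    def __init__(self, state_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, s):
        return self.net(s).squeeze(-1)

def shaped_reward(phi, s, s_next, env_reward, gamma=0.99):
    """Potential-based shaping term F(s, s') = gamma*Phi(s') - Phi(s);
    adding F to the environment reward leaves the optimal policy unchanged."""
    with torch.no_grad():
        f = gamma * phi(s_next) - phi(s)
    return env_reward + f.item()

def potential_update(phi, optimizer, states, magnetic_rewards):
    """One regression step fitting Phi to the magnetic reward
    (an illustrative choice of training target)."""
    loss = nn.functional.mse_loss(phi(states), magnetic_rewards)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```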
Related papers
- Enhancing Spectrum Efficiency in 6G Satellite Networks: A GAIL-Powered Policy Learning via Asynchronous Federated Inverse Reinforcement Learning [67.95280175998792]
A novel generative adversarial imitation learning (GAIL)-powered policy learning approach is proposed for optimizing beamforming, spectrum allocation, and remote user equipment (RUE) association in satellite networks.
We employ inverse RL (IRL) to automatically learn reward functions without manual tuning.
We show that the proposed MA-AL method outperforms traditional RL approaches, achieving a 14.6% improvement in convergence and reward value.
arXiv Detail & Related papers (2024-09-27T13:05:02Z) - Adaptive Horizon Actor-Critic for Policy Learning in Contact-Rich Differentiable Simulation [36.308936312224404]
This paper introduces Adaptive Horizon Actor-Critic (AHAC), a first-order model-based RL (FO-MBRL) algorithm that reduces gradient error by adapting the model-based horizon to avoid stiff dynamics.
Empirical findings reveal that AHAC outperforms model-free RL (MFRL) baselines, attaining 40% more reward across a set of locomotion tasks and efficiently scaling to high-dimensional control environments with improved wall-clock time efficiency.
arXiv Detail & Related papers (2024-05-28T03:28:00Z) - Neural-Kernel Conditional Mean Embeddings [26.862984140099837]
Kernel conditional mean embeddings (CMEs) offer a powerful framework for representing conditional distributions, but they often face scalability and expressiveness challenges.
We propose a new method that effectively combines the strengths of deep learning with CMEs in order to address these challenges.
In conditional density estimation tasks, our NN-CME hybrid achieves competitive performance and often surpasses existing deep learning-based methods.
arXiv Detail & Related papers (2024-03-16T08:51:02Z) - Leveraging Optimal Transport for Enhanced Offline Reinforcement Learning
in Surgical Robotic Environments [4.2569494803130565]
We introduce an innovative algorithm designed to assign rewards to offline trajectories, using a small number of high-quality expert demonstrations.
This approach circumvents the need for handcrafted rewards, unlocking the potential to harness vast datasets for policy learning.
arXiv Detail & Related papers (2023-10-13T03:39:15Z) - Self-Supervised Knowledge-Driven Deep Learning for 3D Magnetic Inversion [6.001304967469112]
The proposed self-supervised knowledge-driven 3D magnetic inversion method learns from the target field data through a closed loop of the inversion and forward models.
A knowledge-driven module in the proposed inversion model makes the deep learning method more explainable.
The experimental results demonstrate that the proposed method is a reliable magnetic inversion method with outstanding performance.
arXiv Detail & Related papers (2023-08-23T15:31:38Z) - Reparameterized Policy Learning for Multimodal Trajectory Optimization [61.13228961771765]
We investigate the challenge of parameterizing policies for reinforcement learning in high-dimensional continuous action spaces.
We propose a principled framework that models the continuous RL policy as a generative model of optimal trajectories.
We present a practical model-based RL method, which leverages the multimodal policy parameterization and learned world model.
arXiv Detail & Related papers (2023-07-20T09:05:46Z) - Reinforcement Learning from Diverse Human Preferences [68.4294547285359]
This paper develops a method for crowd-sourcing preference labels and learning from diverse human preferences.
The proposed method is tested on a variety of tasks in DMControl and Meta-world.
It has shown consistent and significant improvements over existing preference-based RL algorithms when learning from diverse feedback.
arXiv Detail & Related papers (2023-01-27T15:18:54Z) - Guaranteed Conservation of Momentum for Learning Particle-based Fluid
Dynamics [96.9177297872723]
We present a novel method for guaranteeing linear momentum conservation in learned physics simulations.
We enforce conservation of momentum with a hard constraint, which we realize via antisymmetrical continuous convolutional layers.
In combination, the proposed method allows us to increase the physical accuracy of the learned simulator substantially.
arXiv Detail & Related papers (2022-10-12T09:12:59Z) - Off-Dynamics Inverse Reinforcement Learning from Hetero-Domain [11.075036222901417]
We propose an approach for inverse reinforcement learning from a hetero-domain, which learns a reward function in the simulator, drawing on demonstrations from the real world.
The intuition behind the method is that the reward function should not only be oriented to imitate the experts, but should encourage actions adjusted for the dynamics difference between the simulator and the real world.
arXiv Detail & Related papers (2021-10-21T19:23:15Z) - Variational Empowerment as Representation Learning for Goal-Based
Reinforcement Learning [114.07623388322048]
We discuss how the standard goal-conditioned RL (GCRL) is encapsulated by the objective of variational empowerment.
Our work lays a novel foundation from which to evaluate, analyze, and develop representation learning techniques in goal-based RL.
arXiv Detail & Related papers (2021-06-02T18:12:26Z) - Optimization-driven Deep Reinforcement Learning for Robust Beamforming
in IRS-assisted Wireless Communications [54.610318402371185]
Intelligent reflecting surface (IRS) is a promising technology to assist downlink information transmissions from a multi-antenna access point (AP) to a receiver.
We minimize the AP's transmit power by a joint optimization of the AP's active beamforming and the IRS's passive beamforming.
We propose a deep reinforcement learning (DRL) approach that can adapt the beamforming strategies from past experiences.
arXiv Detail & Related papers (2020-05-25T01:42:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.