Balance Between Efficient and Effective Learning: Dense2Sparse Reward
Shaping for Robot Manipulation with Environment Uncertainty
- URL: http://arxiv.org/abs/2003.02740v1
- Date: Thu, 5 Mar 2020 16:10:15 GMT
- Title: Balance Between Efficient and Effective Learning: Dense2Sparse Reward
Shaping for Robot Manipulation with Environment Uncertainty
- Authors: Yongle Luo, Kun Dong, Lili Zhao, Zhiyong Sun, Chao Zhou, Bo Song
- Abstract summary: We propose a simple but powerful reward shaping method, namely Dense2Sparse.
It combines the advantage of fast convergence of dense reward and the noise isolation of the sparse reward, to achieve a balance between learning efficiency and effectiveness.
The experiment results show that the Dense2Sparse method obtained higher expected reward compared with the ones using standalone dense reward or sparse reward, and it also has a superior tolerance of system uncertainty.
- Score: 14.178202899299267
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Efficient and effective learning is one of the ultimate goals of the deep
reinforcement learning (DRL), although the compromise has been made in most of
the time, especially for the application of robot manipulations. Learning is
always expensive for robot manipulation tasks and the learning effectiveness
could be affected by the system uncertainty. In order to solve above
challenges, in this study, we proposed a simple but powerful reward shaping
method, namely Dense2Sparse. It combines the advantage of fast convergence of
dense reward and the noise isolation of the sparse reward, to achieve a balance
between learning efficiency and effectiveness, which makes it suitable for
robot manipulation tasks. We evaluated our Dense2Sparse method with a series of
ablation experiments using the state representation model with system
uncertainty. The experiment results show that the Dense2Sparse method obtained
higher expected reward compared with the ones using standalone dense reward or
sparse reward, and it also has a superior tolerance of system uncertainty.
Related papers
- Learning Reward for Robot Skills Using Large Language Models via Self-Alignment [11.639973274337274]
Large Language Models (LLM) contain valuable task-related knowledge that can potentially aid in the learning of reward functions.
We propose a method to learn rewards more efficiently in the absence of humans.
arXiv Detail & Related papers (2024-05-12T04:57:43Z) - Unsupervised Learning of Effective Actions in Robotics [0.9374652839580183]
Current state-of-the-art action representations in robotics lack proper effect-driven learning of the robot's actions.
We propose an unsupervised algorithm to discretize a continuous motion space and generate "action prototypes"
We evaluate our method on a simulated stair-climbing reinforcement learning task.
arXiv Detail & Related papers (2024-04-03T13:28:52Z) - RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar but potentially even more practical than those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning should be near-optimal and enables the algorithm to learn behaviors that improve over the potential suboptimal human expert.
arXiv Detail & Related papers (2023-11-21T21:05:21Z) - Tactile Active Inference Reinforcement Learning for Efficient Robotic
Manipulation Skill Acquisition [10.072992621244042]
We propose a novel method for skill learning in robotic manipulation called Tactile Active Inference Reinforcement Learning (Tactile-AIRL)
To enhance the performance of reinforcement learning (RL), we introduce active inference, which integrates model-based techniques and intrinsic curiosity into the RL process.
We demonstrate that our method achieves significantly high training efficiency in non-prehensile objects pushing tasks.
arXiv Detail & Related papers (2023-11-19T10:19:22Z) - Planning for Sample Efficient Imitation Learning [52.44953015011569]
Current imitation algorithms struggle to achieve high performance and high in-environment sample efficiency simultaneously.
We propose EfficientImitate, a planning-based imitation learning method that can achieve high in-environment sample efficiency and performance simultaneously.
Experimental results show that EI achieves state-of-the-art results in performance and sample efficiency.
arXiv Detail & Related papers (2022-10-18T05:19:26Z) - Relay Hindsight Experience Replay: Continual Reinforcement Learning for
Robot Manipulation Tasks with Sparse Rewards [26.998587654269873]
We propose a novel model-free continual RL algorithm, called Relay-HER (RHER)
The proposed method first decomposes and rearranges the original long-horizon task into new sub-tasks with incremental complexity.
The experimental results indicate the significant improvements in sample efficiency of RHER compared to vanilla-HER in five typical robot manipulation tasks.
arXiv Detail & Related papers (2022-08-01T13:30:01Z) - Revisiting the Adversarial Robustness-Accuracy Tradeoff in Robot
Learning [121.9708998627352]
Recent work has shown that, in practical robot learning applications, the effects of adversarial training do not pose a fair trade-off.
This work revisits the robustness-accuracy trade-off in robot learning by analyzing if recent advances in robust training methods and theory can make adversarial training suitable for real-world robot applications.
arXiv Detail & Related papers (2022-04-15T08:12:15Z) - Self-Imitation Learning for Robot Tasks with Sparse and Delayed Rewards [1.2691047660244335]
We propose a practical self-imitation learning method named Self-Imitation Learning with Constant Reward (SILCR)
Our method assigns the immediate rewards at each timestep with constant values according to their final episodic rewards.
We demonstrate the effectiveness of our method in some challenging continuous robotics control tasks in MuJoCo simulation.
arXiv Detail & Related papers (2020-10-14T11:12:07Z) - Learning Sparse Rewarded Tasks from Sub-Optimal Demonstrations [78.94386823185724]
Imitation learning learns effectively in sparse-rewarded tasks by leveraging the existing expert demonstrations.
In practice, collecting a sufficient amount of expert demonstrations can be prohibitively expensive.
We propose Self-Adaptive Learning (SAIL) that can achieve (near) optimal performance given only a limited number of sub-optimal demonstrations.
arXiv Detail & Related papers (2020-04-01T15:57:15Z) - Scalable Multi-Task Imitation Learning with Autonomous Improvement [159.9406205002599]
We build an imitation learning system that can continuously improve through autonomous data collection.
We leverage the robot's own trials as demonstrations for tasks other than the one that the robot actually attempted.
In contrast to prior imitation learning approaches, our method can autonomously collect data with sparse supervision for continuous improvement.
arXiv Detail & Related papers (2020-02-25T18:56:42Z) - Soft Hindsight Experience Replay [77.99182201815763]
Soft Hindsight Experience Replay (SHER) is a novel approach based on HER and Maximum Entropy Reinforcement Learning (MERL)
We evaluate SHER on Open AI Robotic manipulation tasks with sparse rewards.
arXiv Detail & Related papers (2020-02-06T03:57:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.