Genetic Imitation Learning by Reward Extrapolation
- URL: http://arxiv.org/abs/2301.07182v1
- Date: Tue, 3 Jan 2023 14:12:28 GMT
- Title: Genetic Imitation Learning by Reward Extrapolation
- Authors: Boyuan Zheng, Jianlong Zhou and Fang Chen
- Abstract summary: We propose a method called GenIL that integrates the Genetic Algorithm with imitation learning.
The involvement of the Genetic Algorithm improves the data efficiency by reproducing trajectories with various returns.
We tested GenIL in both Atari and Mujoco domains, and the results show that it outperforms previous methods.
- Score: 6.340280403330784
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Imitation learning demonstrates remarkable performance in various domains.
However, imitation learning is also constrained by many prerequisites. The
research community has worked intensively to alleviate these constraints, for
example by adding stochastic policies to avoid unseen states, eliminating the
need for action labels, and learning from suboptimal demonstrations.
Inspired by the natural reproduction process, we propose a method called GenIL
that integrates the Genetic Algorithm with imitation learning. The Genetic
Algorithm improves data efficiency by reproducing trajectories with various
returns and helps the model estimate more accurate and compact reward function
parameters. We tested GenIL in both Atari and Mujoco domains, and the results
show that it outperforms previous extrapolation methods in extrapolation
accuracy, robustness, and overall policy performance when input data is
limited.
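The abstract's core idea — breeding new trajectories with a genetic algorithm and using their return ranking to fit a reward function — can be illustrated with a toy sketch. This is not the authors' code: the linear reward model, the crossover/mutation operators, and all function names and parameters below are assumptions made for illustration; the ranking step uses a standard Bradley-Terry pairwise loss as a stand-in for the paper's extrapolation objective.

```python
# Toy GenIL-style loop: a genetic algorithm recombines trajectories to
# produce offspring with varied returns, which supply ranked pairs for
# fitting linear reward weights by pairwise (Bradley-Terry) ranking.
import math
import random

random.seed(0)

STATE_DIM = 3
TRUE_W = [1.0, -0.5, 0.2]  # hidden reward weights, used only to rank synthetic data

def true_return(traj):
    return sum(sum(w * s for w, s in zip(TRUE_W, step)) for step in traj)

def random_traj(length=10):
    return [[random.uniform(-1, 1) for _ in range(STATE_DIM)] for _ in range(length)]

def crossover(a, b):
    """Splice two parent trajectories at a random cut point."""
    cut = random.randint(1, min(len(a), len(b)) - 1)
    return a[:cut] + b[cut:]

def mutate(traj, scale=0.1):
    """Perturb each state slightly to diversify offspring returns."""
    return [[s + random.gauss(0, scale) for s in step] for step in traj]

# 1. Start from a few "demonstrations" and breed a larger population.
population = [random_traj() for _ in range(6)]
for _ in range(3):  # a few generations of reproduction
    parents = random.sample(population, 2)
    population.append(mutate(crossover(*parents)))

# 2. Rank trajectories by return and fit reward weights w by gradient
#    descent on the Bradley-Terry pairwise ranking loss.
ranked = sorted(population, key=true_return)

def feat_sum(traj):
    return [sum(step[i] for step in traj) for i in range(STATE_DIM)]

w = [0.0] * STATE_DIM
for _ in range(200):
    lo, hi = sorted(random.sample(range(len(ranked)), 2))
    f_lo, f_hi = feat_sum(ranked[lo]), feat_sum(ranked[hi])
    r_lo = sum(wi * fi for wi, fi in zip(w, f_lo))
    r_hi = sum(wi * fi for wi, fi in zip(w, f_hi))
    p = 1.0 / (1.0 + math.exp(r_lo - r_hi))   # P(hi preferred over lo)
    for i in range(STATE_DIM):                # gradient step on -log p
        w[i] += 0.01 * (1.0 - p) * (f_hi[i] - f_lo[i])

# The learned weights should correlate with the hidden reward direction.
cos = sum(a * b for a, b in zip(w, TRUE_W)) / (
    math.sqrt(sum(a * a for a in w)) * math.sqrt(sum(b * b for b in TRUE_W)))
```

Because crossover and mutation produce offspring spanning a wider range of returns than the original demonstrations, the ranking loss sees more informative preference pairs — the data-efficiency argument the abstract makes.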
Related papers
- Learning Off-policy with Model-based Intrinsic Motivation For Active Online Exploration [15.463313629574111]
This paper investigates how to achieve sample-efficient exploration in continuous control tasks.
We introduce an RL algorithm that incorporates a predictive model and off-policy learning elements.
We derive an intrinsic reward without incurring parameter overhead.
arXiv Detail & Related papers (2024-03-31T11:39:11Z) - Genetic-guided GFlowNets for Sample Efficient Molecular Optimization [33.270494123656746]
Recent advances in deep learning-based generative methods have shown promise but face the issue of sample efficiency.
This paper proposes a novel algorithm for sample-efficient molecular optimization by distilling a powerful genetic algorithm into deep generative policy.
arXiv Detail & Related papers (2024-02-05T04:12:40Z) - Noisy Self-Training with Synthetic Queries for Dense Retrieval [49.49928764695172]
We introduce a novel noisy self-training framework combined with synthetic queries.
Experimental results show that our method improves consistently over existing methods.
Our method is data efficient and outperforms competitive baselines.
arXiv Detail & Related papers (2023-11-27T06:19:50Z) - RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar to, but potentially more practical than, those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning must be near-optimal, and enables the algorithm to learn behaviors that improve over a potentially suboptimal human expert.
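The intervention-as-reward idea in this summary can be shown with a toy example. This is not the RLIF paper's code: the chain environment, the intervention rule, and all parameters below are invented for illustration. The only reward signal is -1 whenever a supervisor intervenes, and ordinary tabular Q-learning learns a policy that avoids triggering interventions.

```python
# Toy illustration: the agent receives reward -1 only when a supervisor
# intervenes; Q-learning then learns to avoid intervention-triggering states.
import random

random.seed(1)

N_STATES, BAD = 5, 2                # chain 0..4; entering BAD triggers intervention
ACTIONS = (-1, +1)                  # step left / step right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(s, a):
    s2 = max(0, min(N_STATES - 1, s + a))
    intervened = (s2 == BAD)
    if intervened:                  # supervisor takes over and resets the agent
        s2 = 0
    return s2, (-1.0 if intervened else 0.0), intervened

for _ in range(2000):
    s = 0
    for _ in range(20):
        # epsilon-greedy action selection
        a = random.choice(ACTIONS) if random.random() < 0.2 else \
            max(ACTIONS, key=lambda x: Q[(s, x)])
        s2, r, _ = step(s, a)
        # standard Q-learning update; r encodes only intervention events
        Q[(s, a)] += 0.1 * (r + 0.9 * max(Q[(s2, x)] for x in ACTIONS) - Q[(s, a)])
        s = s2
```

After training, the greedy action at state 1 is to step left, away from the intervention-triggering state — the agent never needs task rewards or a near-optimal expert, only the intervention signal.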
arXiv Detail & Related papers (2023-11-21T21:05:21Z) - Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z) - A Boosting Approach to Reinforcement Learning [59.46285581748018]
We study efficient algorithms for reinforcement learning in decision processes whose complexity is independent of the number of states.
We give an efficient algorithm that is capable of improving the accuracy of such weak learning methods.
arXiv Detail & Related papers (2021-08-22T16:00:45Z) - Missingness Augmentation: A General Approach for Improving Generative
Imputation Models [20.245637164975594]
We propose a novel data augmentation method called Missingness Augmentation (MisA) for generative imputation models.
As a general augmentation technique, MisA can be easily integrated into generative imputation frameworks.
Experimental results demonstrate that MisA significantly improves the performance of many recently proposed generative imputation models.
arXiv Detail & Related papers (2021-07-31T08:51:46Z) - Behavior-based Neuroevolutionary Training in Reinforcement Learning [3.686320043830301]
This work presents a hybrid algorithm that combines neuroevolutionary optimization with value-based reinforcement learning.
For this purpose, we consolidate different methods to generate and optimize agent policies, creating a diverse population.
Our results indicate that combining methods can enhance the sample efficiency and learning speed for evolutionary approaches.
arXiv Detail & Related papers (2021-05-17T15:40:42Z) - DEALIO: Data-Efficient Adversarial Learning for Imitation from
Observation [57.358212277226315]
In imitation learning from observation (IfO), a learning agent seeks to imitate a demonstrating agent using only observations of the demonstrated behavior, without access to the control signals generated by the demonstrator.
Recent methods based on adversarial imitation learning have led to state-of-the-art performance on IfO problems, but they typically suffer from high sample complexity due to a reliance on data-inefficient, model-free reinforcement learning algorithms.
This issue makes them impractical to deploy in real-world settings, where gathering samples can incur high costs in terms of time, energy, and risk.
We propose a more data-efficient IfO algorithm.
arXiv Detail & Related papers (2021-03-31T23:46:32Z) - Reparameterized Variational Divergence Minimization for Stable Imitation [57.06909373038396]
We study the extent to which variations in the choice of probabilistic divergence may yield more performant ILO algorithms.
We contribute a reparameterization trick for adversarial imitation learning to alleviate the challenges of the promising $f$-divergence minimization framework.
Empirically, we demonstrate that our design choices allow for ILO algorithms that outperform baseline approaches and more closely match expert performance in low-dimensional continuous-control tasks.
arXiv Detail & Related papers (2020-06-18T19:04:09Z) - Active Learning for Gaussian Process Considering Uncertainties with
Application to Shape Control of Composite Fuselage [7.358477502214471]
We propose two new active learning algorithms for the Gaussian process with uncertainties.
We show that the proposed approach can incorporate the impact from uncertainties, and realize better prediction performance.
This approach has been applied to improving the predictive modeling for automatic shape control of composite fuselage.
arXiv Detail & Related papers (2020-04-23T02:04:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.