Genetic Imitation Learning by Reward Extrapolation
- URL: http://arxiv.org/abs/2301.07182v1
- Date: Tue, 3 Jan 2023 14:12:28 GMT
- Title: Genetic Imitation Learning by Reward Extrapolation
- Authors: Boyuan Zheng, Jianlong Zhou and Fang Chen
- Abstract summary: We propose a method called GenIL that integrates the Genetic Algorithm with imitation learning.
The involvement of the Genetic Algorithm improves the data efficiency by reproducing trajectories with various returns.
We tested GenIL in both Atari and MuJoCo domains, and the results show that it outperforms the previous methods.
- Score: 6.340280403330784
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Imitation learning demonstrates remarkable performance in various domains.
However, imitation learning is also constrained by many prerequisites. The
research community has done intensive research to alleviate these constraints,
such as adding a stochastic policy to avoid unseen states, eliminating the
need for action labels, and learning from suboptimal demonstrations.
Inspired by the natural reproduction process, we propose a method called GenIL
that integrates the Genetic Algorithm with imitation learning. The involvement
of the Genetic Algorithm improves the data efficiency by reproducing
trajectories with various returns and assists the model in estimating more
accurate and compact reward function parameters. We tested GenIL in both Atari
and MuJoCo domains, and the results show that it outperforms the
previous extrapolation methods in extrapolation accuracy, robustness, and
overall policy performance when input data is limited.
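As a rough illustration of what "reproducing trajectories" with a genetic algorithm could look like, the sketch below applies single-point crossover and random mutation to demonstrations stored as sequences of discrete actions. It is not the authors' implementation: the function names (`crossover`, `mutate`, `reproduce`), the action-list representation, and the `rate` and `num_actions` parameters are all assumptions made for this example, and the step of scoring or ranking the offspring by return is left out.

```python
import random


def crossover(parent_a, parent_b, rng):
    """Single-point crossover: prefix of parent_a joined to suffix of parent_b."""
    # Assumes both parents contain at least two steps.
    cut = rng.randint(1, min(len(parent_a), len(parent_b)) - 1)
    return parent_a[:cut] + parent_b[cut:]


def mutate(trajectory, rate, num_actions, rng):
    """Replace each action with a uniformly random one with probability `rate`."""
    return [
        rng.randrange(num_actions) if rng.random() < rate else action
        for action in trajectory
    ]


def reproduce(demos, n_offspring, rate, num_actions, seed=0):
    """Generate offspring trajectories from pairs of demonstrations (hypothetical helper)."""
    rng = random.Random(seed)
    offspring = []
    for _ in range(n_offspring):
        parent_a, parent_b = rng.sample(demos, 2)  # two distinct parents
        child = crossover(parent_a, parent_b, rng)
        offspring.append(mutate(child, rate, num_actions, rng))
    return offspring


if __name__ == "__main__":
    # Toy demonstrations: three action sequences over four discrete actions.
    demos = [[0, 1, 2, 3, 0, 1], [3, 2, 1, 0, 3, 2], [1, 1, 1, 1, 1, 1]]
    for child in reproduce(demos, n_offspring=5, rate=0.2, num_actions=4):
        print(child)
```

In a pipeline of the kind the abstract describes, offspring like these would presumably be evaluated so that trajectories with different returns can be used when fitting the reward function; how that is done is not specified here.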
Related papers
- Shortening the Trajectories: Identity-Aware Gaussian Approximation for Efficient 3D Molecular Generation [2.631060597686179]
Probabilistic Generative Models (GPGMs) generate data by reversing a process that corrupts samples with Gaussian noise.
These models have achieved state-of-the-art performance across diverse domains, but their practical deployment remains constrained by the high computational cost.
We introduce a theoretically grounded and empirically validated framework that improves generation efficiency without sacrificing training granularity or inference fidelity.
arXiv Detail & Related papers (2025-07-11T21:39:32Z) - Active Learning for Manifold Gaussian Process Regression [5.618322163107168]
This paper introduces an active learning framework for manifold Gaussian Process (GP) regression.
It combines manifold learning with strategic data selection to improve accuracy in high-dimensional spaces.
arXiv Detail & Related papers (2025-06-26T01:25:39Z) - DeepGDel: Deep Learning-based Gene Deletion Prediction Framework for Growth-Coupled Production in Genome-Scale Metabolic Models [0.46551592572821365]
We propose a framework for predicting gene deletion strategies for growth-coupled production in genome-scale metabolic models.
The proposed framework leverages deep learning algorithms to learn and integrate sequential gene and metabolite data representation.
Experiment results demonstrate the feasibility of the proposed framework, showing substantial improvements over the baseline method.
arXiv Detail & Related papers (2025-04-08T08:07:59Z) - PALATE: Peculiar Application of the Law of Total Expectation to Enhance the Evaluation of Deep Generative Models [0.5499796332553708]
Deep generative models (DGMs) have caused a paradigm shift in the field of machine learning.
A comprehensive evaluation of these models that accounts for the trichotomy between fidelity, diversity, and novelty in generated samples remains a formidable challenge.
We propose PALATE, a novel enhancement to the evaluation of DGMs that addresses limitations of existing metrics.
arXiv Detail & Related papers (2025-03-24T09:06:45Z) - Efficient Data Selection for Training Genomic Perturbation Models [8.362190332905524]
Gene perturbation models based on graph neural networks are trained to predict the outcomes of gene perturbations.
Active learning is often employed to train these models, alternating between wet-lab experiments and model updates.
We propose a graph-based data filtering method that selects the gene perturbations in one shot and in a model-free manner.
arXiv Detail & Related papers (2025-03-18T12:52:03Z) - U-aggregation: Unsupervised Aggregation of Multiple Learning Algorithms [4.871473117968554]
We propose an unsupervised model aggregation method, U-aggregation, for enhanced and robust performance in new populations.
Unlike existing supervised model aggregation or super learner approaches, U-aggregation assumes no observed labels or outcomes in the target population.
We demonstrate its potential real-world application by using U-aggregation to enhance genetic risk prediction of complex traits.
arXiv Detail & Related papers (2025-01-30T01:42:51Z) - Learning Off-policy with Model-based Intrinsic Motivation For Active Online Exploration [15.463313629574111]
This paper investigates how to achieve sample-efficient exploration in continuous control tasks.
We introduce an RL algorithm that incorporates a predictive model and off-policy learning elements.
We derive an intrinsic reward without incurring parameter overhead.
arXiv Detail & Related papers (2024-03-31T11:39:11Z) - Genetic-guided GFlowNets for Sample Efficient Molecular Optimization [33.270494123656746]
Recent advances in deep learning-based generative methods have shown promise but face the issue of sample efficiency.
This paper proposes a novel algorithm for sample-efficient molecular optimization by distilling a powerful genetic algorithm into deep generative policy.
arXiv Detail & Related papers (2024-02-05T04:12:40Z) - Noisy Self-Training with Synthetic Queries for Dense Retrieval [49.49928764695172]
We introduce a novel noisy self-training framework combined with synthetic queries.
Experimental results show that our method improves consistently over existing methods.
Our method is data efficient and outperforms competitive baselines.
arXiv Detail & Related papers (2023-11-27T06:19:50Z) - RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar but potentially even more practical than those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning should be near-optimal and enables the algorithm to learn behaviors that improve over the potential suboptimal human expert.
arXiv Detail & Related papers (2023-11-21T21:05:21Z) - Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce Stochastic UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z) - A Boosting Approach to Reinforcement Learning [59.46285581748018]
We study efficient algorithms for reinforcement learning in decision processes whose complexity is independent of the number of states.
We give an efficient algorithm that is capable of improving the accuracy of such weak learning methods.
arXiv Detail & Related papers (2021-08-22T16:00:45Z) - Missingness Augmentation: A General Approach for Improving Generative Imputation Models [20.245637164975594]
We propose a novel data augmentation method called Missingness Augmentation (MisA) for generative imputation models.
As a general augmentation technique, MisA can be easily integrated into generative imputation frameworks.
Experimental results demonstrate that MisA significantly improves the performance of many recently proposed generative imputation models.
arXiv Detail & Related papers (2021-07-31T08:51:46Z) - Behavior-based Neuroevolutionary Training in Reinforcement Learning [3.686320043830301]
This work presents a hybrid algorithm that combines neuroevolutionary optimization with value-based reinforcement learning.
For this purpose, we consolidate different methods to generate and optimize agent policies, creating a diverse population.
Our results indicate that combining methods can enhance the sample efficiency and learning speed for evolutionary approaches.
arXiv Detail & Related papers (2021-05-17T15:40:42Z) - DEALIO: Data-Efficient Adversarial Learning for Imitation from Observation [57.358212277226315]
In imitation learning from observation (IfO), a learning agent seeks to imitate a demonstrating agent using only observations of the demonstrated behavior, without access to the control signals generated by the demonstrator.
Recent methods based on adversarial imitation learning have led to state-of-the-art performance on IfO problems, but they typically suffer from high sample complexity due to a reliance on data-inefficient, model-free reinforcement learning algorithms.
This issue makes them impractical to deploy in real-world settings, where gathering samples can incur high costs in terms of time, energy, and risk.
We propose a more data-efficient IfO algorithm
arXiv Detail & Related papers (2021-03-31T23:46:32Z) - Reparameterized Variational Divergence Minimization for Stable Imitation [57.06909373038396]
We study the extent to which variations in the choice of probabilistic divergence may yield more performant ILO algorithms.
We contribute a reparameterization trick for adversarial imitation learning to alleviate the challenges of the promising $f$-divergence minimization framework.
Empirically, we demonstrate that our design choices allow for ILO algorithms that outperform baseline approaches and more closely match expert performance in low-dimensional continuous-control tasks.
arXiv Detail & Related papers (2020-06-18T19:04:09Z) - Active Learning for Gaussian Process Considering Uncertainties with Application to Shape Control of Composite Fuselage [7.358477502214471]
We propose two new active learning algorithms for the Gaussian process with uncertainties.
We show that the proposed approach can incorporate the impact from uncertainties, and realize better prediction performance.
This approach has been applied to improving the predictive modeling for automatic shape control of composite fuselage.
arXiv Detail & Related papers (2020-04-23T02:04:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.