Related papers: Robust Imitation Learning against Variations in Environment Dynamics

URL: http://arxiv.org/abs/2206.09314v1
Date: Sun, 19 Jun 2022 03:06:13 GMT
Title: Robust Imitation Learning against Variations in Environment Dynamics
Authors: Jongseong Chae, Seungyul Han, Whiyoung Jung, Myungsik Cho, Sungho Choi, Youngchul Sung
Abstract summary: We propose a robust imitation learning (IL) framework that improves the robustness of IL when environment dynamics are perturbed. Our framework effectively deals with environments with varying dynamics by imitating multiple experts in sampled environment dynamics.
Score: 17.15933046951096
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this paper, we propose a robust imitation learning (IL) framework that improves the robustness of IL when environment dynamics are perturbed. An existing IL framework trained in a single environment can fail catastrophically under perturbations in environment dynamics because it does not account for the possibility that the underlying dynamics may change. Our framework effectively deals with environments with varying dynamics by imitating multiple experts in sampled environment dynamics, which enhances robustness to general variations in environment dynamics. In order to robustly imitate the multiple sample experts, we minimize the risk with respect to the Jensen-Shannon divergence between the agent's policy and each of the sample experts. Numerical results show that our algorithm significantly improves robustness against dynamics perturbations compared to conventional IL baselines.
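Illustration: the abstract refers to the Jensen-Shannon divergence between the agent and each sampled expert. The sketch below is only a toy illustration of that quantity, not the authors' algorithm. It assumes small discrete probability vectors stand in for the agent and the experts, and it assumes a worst-case (maximum) aggregation over experts, which the abstract does not specify. The function names `kl`, `js`, and `worst_case_js` are hypothetical.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions (natural log)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their mixture m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def worst_case_js(agent, experts):
    """Assumed aggregation: the largest JS divergence between the agent and any expert."""
    return max(js(agent, e) for e in experts)

# Toy example: one agent distribution and two "sample experts" (made-up numbers).
agent = [0.5, 0.5]
experts = [[0.6, 0.4], [0.9, 0.1]]
risk = worst_case_js(agent, experts)
```

Under this assumed aggregation, an agent that matches one expert exactly but is far from another still receives a high risk, which is the intuition behind imitating several experts at once.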

Related papers

A Behavior-Aware Approach for Deep Reinforcement Learning in Non-stationary Environments without Known Change Points [30.077746056549678]
This research introduces Behavior-Aware Detection and Adaptation (BADA), an innovative framework that merges environmental change detection with behavior adaptation. The key inspiration behind our method is that policies exhibit different global behaviors in changing environments. The results of a series of experiments demonstrate better performance relative to several current algorithms.
arXiv Detail & Related papers (2024-05-23T06:17:26Z)
Dynamic Quality-Diversity Search [2.4797200957733576]
This paper introduces a novel and generalisable Dynamic QD methodology that aims to keep the archive of past solutions updated in the case of environment changes. We also present a novel characterisation of dynamic environments that can be easily applied to well-known benchmarks, with minor interventions to move them from a static task to a dynamic one.
arXiv Detail & Related papers (2024-04-07T19:00:15Z)
HAZARD Challenge: Embodied Decision Making in Dynamically Changing Environments [93.94020724735199]
HAZARD consists of three unexpected disaster scenarios, including fire, flood, and wind. This benchmark enables us to evaluate autonomous agents' decision-making capabilities across various pipelines.
arXiv Detail & Related papers (2024-01-23T18:59:43Z)
SpReME: Sparse Regression for Multi-Environment Dynamic Systems [6.7053978622785415]
We develop a method of sparse regression dubbed SpReME to discover the major dynamics that underlie multiple environments. We demonstrate that the proposed model captures the correct dynamics from multiple environments over four different dynamic systems with improved prediction performance.
arXiv Detail & Related papers (2023-02-12T15:45:50Z)
LEADS: Learning Dynamical Systems that Generalize Across Environments [12.024388048406587]
We propose LEADS, a novel framework that leverages the commonalities and discrepancies among known environments to improve model generalization. We show that this new setting can exploit knowledge extracted from environment-dependent data and improves generalization for both known and novel environments.
arXiv Detail & Related papers (2021-06-08T17:28:19Z)
Robust Reconfigurable Intelligent Surfaces via Invariant Risk and Causal Representations [55.50218493466906]
In this paper, the problem of robust reconfigurable intelligent surface (RIS) system design under changes in data distributions is investigated. Using the notion of invariant risk minimization (IRM), an invariant causal representation across multiple environments is used such that the predictor is simultaneously optimal for each environment. A neural network-based solution is adopted to seek the predictor and its performance is validated via simulations against an empirical risk minimization-based design.
arXiv Detail & Related papers (2021-05-04T21:36:31Z)
Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design [121.73425076217471]
We propose Unsupervised Environment Design (UED), where developers provide environments with unknown parameters, and these parameters are used to automatically produce a distribution over valid, solvable environments. We call our technique Protagonist Antagonist Induced Regret Environment Design (PAIRED). Our experiments demonstrate that PAIRED produces a natural curriculum of increasingly complex environments, and PAIRED agents achieve higher zero-shot transfer performance when tested in highly novel environments.
arXiv Detail & Related papers (2020-12-03T17:37:01Z)
Variational Dynamic for Self-Supervised Exploration in Deep Reinforcement Learning [12.76337275628074]
In this work, we propose a variational dynamic model based on conditional variational inference to model the multimodality and generativity. We derive an upper bound of the negative log-likelihood of the environmental transition and use such an upper bound as the intrinsic reward for exploration. Our method outperforms several state-of-the-art environment model-based exploration approaches.
arXiv Detail & Related papers (2020-10-17T09:54:51Z)
Dynamic Regret of Policy Optimization in Non-stationary Environments [120.01408308460095]
We propose two model-free policy optimization algorithms, POWER and POWER++, and establish guarantees for their dynamic regret. We show that POWER++ improves over POWER on the second component of the dynamic regret by actively adapting to non-stationarity through prediction. To the best of our knowledge, our work is the first dynamic regret analysis of model-free RL algorithms in non-stationary environments.
arXiv Detail & Related papers (2020-06-30T23:34:37Z)
Ecological Reinforcement Learning [76.9893572776141]
We study the kinds of environment properties that can make learning under such conditions easier. Understanding how properties of the environment impact the performance of reinforcement learning agents can help us to structure our tasks in ways that make learning tractable.
arXiv Detail & Related papers (2020-06-22T17:55:03Z)
Deep Reinforcement Learning amidst Lifelong Non-Stationarity [67.24635298387624]
We show that an off-policy RL algorithm can reason about and tackle lifelong non-stationarity. Our method leverages latent variable models to learn a representation of the environment from current and past experiences. We also introduce several simulation environments that exhibit lifelong non-stationarity, and empirically find that our approach substantially outperforms approaches that do not reason about environment shift.
arXiv Detail & Related papers (2020-06-18T17:34:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.