On the Benefits of Inducing Local Lipschitzness for Robust Generative
Adversarial Imitation Learning
- URL: http://arxiv.org/abs/2107.00116v3
- Date: Mon, 15 Jan 2024 20:05:14 GMT
- Title: On the Benefits of Inducing Local Lipschitzness for Robust Generative
Adversarial Imitation Learning
- Authors: Farzan Memarian, Abolfazl Hashemi, Scott Niekum, Ufuk Topcu
- Abstract summary: We study the effect of local Lipschitzness of the discriminator and the generator on the robustness of policies learned by GAIL.
We show that the modified objective leads to learning significantly more robust policies.
- Score: 36.48610705372544
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We explore methodologies to improve the robustness of generative adversarial
imitation learning (GAIL) algorithms to observation noise. Towards this
objective, we study the effect of local Lipschitzness of the discriminator and
the generator on the robustness of policies learned by GAIL. In many robotics
applications, policies learned by GAIL typically suffer degraded
performance at test time since the observations from the environment might be
corrupted by noise. Hence, robustifying the learned policies against the
observation noise is of critical importance. To this end, we propose a
regularization method to induce local Lipschitzness in the generator and the
discriminator of adversarial imitation learning methods. We show that the
modified objective leads to learning significantly more robust policies.
Moreover, we demonstrate -- both theoretically and experimentally -- that
training a locally Lipschitz discriminator leads to a locally Lipschitz
generator, thereby improving the robustness of the resultant policy. We perform
extensive experiments on simulated robot locomotion environments from the
MuJoCo suite that demonstrate the proposed method learns policies that
significantly outperform the state-of-the-art generative adversarial imitation
learning algorithm when applied to test scenarios with noise-corrupted
observations.
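The abstract does not spell out the regularizer itself, but a common way to induce local Lipschitzness is to penalize the norm of the discriminator's gradient at points sampled near the training observations. The PyTorch sketch below illustrates that idea; the discriminator architecture, the perturbation radius eps, and the penalty weight lam are illustrative assumptions rather than the authors' exact formulation.

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Simple MLP discriminator over (state, action) pairs."""
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1))

def local_lipschitz_penalty(disc, obs, act, eps=0.1):
    """Penalize the squared gradient norm of the discriminator at points
    sampled inside an eps-ball around the training observations, which
    encourages local (rather than global) Lipschitzness."""
    noisy_obs = (obs + eps * torch.randn_like(obs)).detach().requires_grad_(True)
    scores = disc(noisy_obs, act)
    grads, = torch.autograd.grad(scores.sum(), noisy_obs, create_graph=True)
    return (grads.norm(dim=-1) ** 2).mean()

# Hypothetical usage inside the discriminator update (lam weights the penalty):
# loss = bce(disc(expert_obs, expert_act), ones) \
#      + bce(disc(policy_obs, policy_act), zeros) \
#      + lam * local_lipschitz_penalty(disc, policy_obs, policy_act)
```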
Related papers
- Robust Behavior Cloning Via Global Lipschitz Regularization [0.5767156832161817]
Behavior cloning is an effective imitation learning technique that has been adopted even in safety-critical domains such as autonomous vehicles. We use a global Lipschitz regularization approach to enhance the robustness of the learned policy network, and we propose a way to construct a Lipschitz neural network that guarantees policy robustness.
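One standard way to obtain such a Lipschitz network (not necessarily the construction used in that paper) is to bound each layer's spectral norm, since a composition of 1-Lipschitz layers and 1-Lipschitz activations is itself 1-Lipschitz. A minimal PyTorch sketch under that assumption:

```python
import torch.nn as nn
from torch.nn.utils.parametrizations import spectral_norm

def lipschitz_policy(obs_dim, act_dim, hidden=64):
    """Spectral normalization keeps each linear layer's spectral norm
    approximately at 1 (via power iteration), and ReLU/Tanh are
    1-Lipschitz, so the whole network is roughly 1-Lipschitz in the
    l2 sense (the per-layer constants multiply under composition)."""
    return nn.Sequential(
        spectral_norm(nn.Linear(obs_dim, hidden)), nn.ReLU(),
        spectral_norm(nn.Linear(hidden, hidden)), nn.ReLU(),
        spectral_norm(nn.Linear(hidden, act_dim)), nn.Tanh(),
    )
```

Note the contrast with the main paper above, which regularizes toward local Lipschitzness around visited observations rather than constraining the network globally.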
arXiv Detail & Related papers (2025-06-24T02:19:08Z)
- Provable Guarantees for Generative Behavior Cloning: Bridging Low-Level Stability and High-Level Behavior [51.60683890503293]
We propose a theoretical framework for studying behavior cloning of complex expert demonstrations using generative modeling.
We show that pure supervised cloning can generate trajectories matching the per-time step distribution of arbitrary expert trajectories.
arXiv Detail & Related papers (2023-07-27T04:27:26Z)
- Certifiably Robust Reinforcement Learning through Model-Based Abstract Interpretation [10.69970450827617]
We present a reinforcement learning framework in which the learned policy comes with a machine-checkable certificate of provable adversarial robustness.
We experimentally evaluate CAROL on four MuJoCo environments with continuous state and action spaces.
Compared with policies from state-of-the-art robust RL algorithms, the policies CAROL learns exhibit (i) markedly improved certified lower bounds on performance and (ii) comparable performance under empirical adversarial attacks.
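As an illustration of the abstract-interpretation idea behind such certificates, the NumPy sketch below propagates interval bounds through a small ReLU network: any input inside the given box is guaranteed to map into the computed output box. This is only the simplest abstract domain and not CAROL's actual certification procedure, which reasons over a learned environment model.

```python
import numpy as np

def interval_linear(lo, hi, W, b):
    """Propagate an elementwise interval [lo, hi] through x -> W @ x + b.
    Positive weights take their lower bound from lo, negative ones from hi."""
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    return W_pos @ lo + W_neg @ hi + b, W_pos @ hi + W_neg @ lo + b

def interval_mlp(lo, hi, layers):
    """layers: list of (W, b). ReLU is monotone, so it maps interval
    endpoints to interval endpoints between the linear layers."""
    for i, (W, b) in enumerate(layers):
        lo, hi = interval_linear(lo, hi, W, b)
        if i < len(layers) - 1:
            lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)
    return lo, hi

# Toy example: certified output box for all observations within an eps = 0.01 box.
rng = np.random.default_rng(0)
layers = [(rng.standard_normal((8, 4)), np.zeros(8)),
          (rng.standard_normal((2, 8)), np.zeros(2))]
obs = np.array([0.3, -0.1, 0.2, 0.05])
lo, hi = interval_mlp(obs - 0.01, obs + 0.01, layers)
```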
arXiv Detail & Related papers (2023-01-26T19:42:58Z)
- Risk-Sensitive Reinforcement Learning with Exponential Criteria [0.0]
We provide a definition of robust reinforcement learning policies and formulate a risk-sensitive reinforcement learning problem to approximate them.
We introduce a novel online Actor-Critic algorithm based on solving a multiplicative Bellman equation using approximation updates.
The implementation, performance, and robustness properties of the proposed methods are evaluated in simulated experiments.
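The exponential criterion referred to here replaces the expected return E[R] with the exponential utility (1/beta) log E[exp(beta R)]; for beta < 0 this objective penalizes return variance and yields risk-averse, more robust behavior. The NumPy sketch below estimates the criterion from sampled episode returns; it illustrates the objective only, not the paper's actor-critic algorithm.

```python
import numpy as np

def exponential_criterion(returns, beta=-0.5):
    """Monte Carlo estimate of (1/beta) * log E[exp(beta * R)].
    A Taylor expansion gives roughly E[R] + (beta/2) Var[R], so beta < 0
    is risk-averse and beta -> 0 recovers the ordinary expected return."""
    z = beta * np.asarray(returns, dtype=np.float64)
    m = z.max()  # log-sum-exp trick for numerical stability
    return (m + np.log(np.mean(np.exp(z - m)))) / beta

returns = [100.0, 102.0, 40.0, 101.0]               # one bad episode
print(exponential_criterion(returns, beta=-0.5))    # ~42.8, well below the mean
print(np.mean(returns))                             # 85.75
```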
arXiv Detail & Related papers (2022-12-18T04:44:38Z)
- Imitating, Fast and Slow: Robust learning from demonstrations via decision-time planning [96.72185761508668]
Planning at Test-time (IMPLANT) is a new meta-algorithm for imitation learning.
We demonstrate that IMPLANT significantly outperforms benchmark imitation learning approaches on standard control environments.
arXiv Detail & Related papers (2022-04-07T17:16:52Z)
- Robust Learning from Observation with Model Misspecification [33.92371002674386]
Imitation learning (IL) is a popular paradigm for training policies in robotic systems.
We propose a robust IL algorithm to learn policies that can effectively transfer to the real environment without fine-tuning.
arXiv Detail & Related papers (2022-02-12T07:04:06Z)
- Policy Smoothing for Provably Robust Reinforcement Learning [109.90239627115336]
We study the provable robustness of reinforcement learning against norm-bounded adversarial perturbations of the inputs.
We generate certificates that guarantee that the total reward obtained by the smoothed policy will not fall below a certain threshold under a norm-bounded adversarial perturbation of the input.
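Concretely, policy smoothing evaluates the policy on Gaussian-perturbed observations, and the certificate applies to the resulting smoothed policy. The NumPy sketch below shows one simple way to smooth a deterministic continuous-control policy by averaging its action over sampled noise; the toy policy, sigma, and sample count are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def smoothed_action(policy, obs, sigma=0.1, n_samples=64, rng=None):
    """Monte Carlo approximation of the Gaussian-smoothed policy
        pi_smooth(obs) = E_{d ~ N(0, sigma^2 I)}[ pi(obs + d) ].
    The smoothed policy changes slowly under small input perturbations,
    which is the property the robustness certificates exploit."""
    rng = rng or np.random.default_rng()
    noise = rng.normal(0.0, sigma, size=(n_samples,) + np.shape(obs))
    actions = np.stack([policy(obs + d) for d in noise])
    return actions.mean(axis=0)

# Example with a toy deterministic policy:
policy = lambda o: np.tanh(o[:2] - o[2:])
obs = np.array([0.3, -0.1, 0.2, 0.05])
print(smoothed_action(policy, obs, sigma=0.05))
```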
arXiv Detail & Related papers (2021-06-21T21:42:08Z)
- Robust Imitation Learning from Noisy Demonstrations [81.67837507534001]
We show that robust imitation learning can be achieved by optimizing a classification risk with a symmetric loss.
We propose a new imitation learning method that effectively combines pseudo-labeling with co-training.
Experimental results on continuous-control benchmarks show that our method is more robust compared to state-of-the-art methods.
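A symmetric loss is one where the losses of a margin z and its negation sum to a constant, which is what makes the classification risk insensitive to symmetric label noise in the demonstrations. The NumPy sketch below contrasts the symmetric sigmoid loss with the ordinary, non-symmetric logistic loss; it illustrates the property only, not the paper's full pseudo-labeling and co-training method.

```python
import numpy as np

def sigmoid_loss(z):
    """Symmetric loss: sigmoid_loss(z) + sigmoid_loss(-z) == 1 for every
    margin z, the property behind robustness to symmetric label noise."""
    return 1.0 / (1.0 + np.exp(z))

def logistic_loss(z):
    """Ordinary logistic loss, not symmetric: the sum below depends on z."""
    return np.log1p(np.exp(-z))

z = np.linspace(-3, 3, 7)
print(sigmoid_loss(z) + sigmoid_loss(-z))    # constant 1.0
print(logistic_loss(z) + logistic_loss(-z))  # varies with z
```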
arXiv Detail & Related papers (2020-10-20T10:41:37Z)
- Guided Uncertainty-Aware Policy Optimization: Combining Learning and Model-Based Strategies for Sample-Efficient Policy Learning [75.56839075060819]
Traditional robotic approaches rely on an accurate model of the environment, a detailed description of how to perform the task, and a robust perception system to keep track of the current state.
Reinforcement learning approaches can operate directly from raw sensory inputs with only a reward signal to describe the task, but they are extremely sample-inefficient and brittle.
In this work, we combine the strengths of model-based methods with the flexibility of learning-based methods to obtain a general method that is able to overcome inaccuracies in the robotics perception/actuation pipeline.
arXiv Detail & Related papers (2020-05-21T19:47:05Z)