Robust Adversarial Reinforcement Learning via Bounded Rationality Curricula
- URL: http://arxiv.org/abs/2311.01642v1
- Date: Fri, 3 Nov 2023 00:00:32 GMT
- Title: Robust Adversarial Reinforcement Learning via Bounded Rationality Curricula
- Authors: Aryaman Reddi, Maximilian Tölle, Jan Peters, Georgia Chalvatzaki, Carlo D'Eramo
- Abstract summary: Adversarial Reinforcement Learning trains a protagonist against destabilizing forces exercised by an adversary in a competitive zero-sum Markov game.
Finding Nash equilibria requires facing complex saddle point optimization problems, which can be prohibitive to solve.
We propose a novel approach for adversarial RL based on entropy regularization to ease the complexity of the saddle point optimization problem.
- Score: 23.80052541774509
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robustness against adversarial attacks and distribution shifts is a
long-standing goal of Reinforcement Learning (RL). To this end, Robust
Adversarial Reinforcement Learning (RARL) trains a protagonist against
destabilizing forces exercised by an adversary in a competitive zero-sum Markov
game, whose optimal solution, i.e., rational strategy, corresponds to a Nash
equilibrium. However, finding Nash equilibria requires facing complex saddle
point optimization problems, which can be prohibitive to solve, especially for
high-dimensional control. In this paper, we propose a novel approach for
adversarial RL based on entropy regularization to ease the complexity of the
saddle point optimization problem. We show that the solution of this
entropy-regularized problem corresponds to a Quantal Response Equilibrium
(QRE), a generalization of Nash equilibria that accounts for bounded
rationality, i.e., agents sometimes play random actions instead of optimal
ones. Crucially, the connection between the entropy-regularized objective and
QRE enables free modulation of the rationality of the agents by simply tuning
the temperature coefficient. We leverage this insight to propose our novel
algorithm, Quantal Adversarial RL (QARL), which gradually increases the
rationality of the adversary in a curriculum fashion until it is fully
rational, easing the complexity of the optimization problem while retaining
robustness. We provide extensive evidence of QARL outperforming RARL and recent
baselines across several MuJoCo locomotion and navigation problems in overall
performance and robustness.
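The temperature-as-rationality idea can be illustrated on a toy zero-sum matrix game. The following minimal sketch is not the authors' implementation: both players give softmax ("quantal") responses to their expected payoffs, and only the adversary's temperature is annealed on a curriculum, from near-random play toward a fully rational best response. The payoff matrix, step sizes, and schedule are illustrative assumptions.

```python
# Hedged sketch (not the authors' code) of the core QARL idea on a toy
# zero-sum matrix game: both players give softmax ("quantal") responses to
# their expected payoffs, and the adversary's temperature is annealed on a
# curriculum from near-random play toward a fully rational best response.
import numpy as np

def quantal_response(payoffs, tau):
    """Softmax over expected payoffs; tau -> 0 recovers the rational argmax."""
    z = payoffs / max(tau, 1e-8)
    z -= z.max()
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))        # protagonist maximizes A[i, j], adversary minimizes
pi_pro = np.ones(4) / 4
pi_adv = np.ones(4) / 4
tau_pro = 1e-3                     # protagonist stays (nearly) rational throughout

# Curriculum over the adversary's temperature: high tau = bounded rationality
# (quantal-response play), low tau = fully rational adversary.
for k, tau_adv in enumerate(np.geomspace(5.0, 1e-3, num=300)):
    u_pro = A @ pi_adv             # expected payoff of each protagonist action
    u_adv = -(pi_pro @ A)          # adversary maximizes the negated payoff
    lr = 1.0 / (k + 2)             # fictitious-play-style averaging of responses
    pi_pro = (1 - lr) * pi_pro + lr * quantal_response(u_pro, tau_pro)
    pi_adv = (1 - lr) * pi_adv + lr * quantal_response(u_adv, tau_adv)

print("protagonist policy:", np.round(pi_pro, 3))
print("adversary policy:  ", np.round(pi_adv, 3))
print("estimated game value:", float(pi_pro @ A @ pi_adv))
```

As the adversary's temperature shrinks, its quantal response concentrates on the payoff-minimizing action, which is the sense in which the curriculum ends with a fully rational opponent.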
Related papers
- Zero-Sum Positional Differential Games as a Framework for Robust Reinforcement Learning: Deep Q-Learning Approach [2.3020018305241337]
This paper is the first to propose considering RRL problems within positional differential game theory.
Namely, we prove that under Isaacs's condition, the same Q-function can be utilized as an approximate solution of both minimax and maximin Bellman equations.
We present the Isaacs Deep Q-Network algorithms and demonstrate their superiority over baseline RRL and Multi-Agent RL algorithms in various environments (a toy maximin Bellman backup is sketched after this entry).
arXiv Detail & Related papers (2024-05-03T12:21:43Z)
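To make the shared Q-function statement concrete, here is a hedged, tabular toy: a maximin Bellman backup for a finite zero-sum Markov game with deterministic, invented dynamics. The actual paper works with positional differential games and deep Q-networks; in this discrete toy, the maximin and minimax values coincide only when each stage game has a pure saddle point, which is the role Isaacs's condition plays in the differential-game setting.

```python
# Hedged toy sketch (not the paper's Isaacs DQN): a tabular maximin Bellman
# backup for a zero-sum Markov game with deterministic, invented dynamics.
import numpy as np

n_states, n_a, n_b, gamma = 3, 2, 2, 0.9
rng = np.random.default_rng(1)
R = rng.normal(size=(n_states, n_a, n_b))                  # protagonist reward
T = rng.integers(0, n_states, size=(n_states, n_a, n_b))   # deterministic next state

Q = np.zeros((n_states, n_a, n_b))
for _ in range(200):
    V = Q.min(axis=2).max(axis=1)   # maximin value of the stage game per state
    Q = R + gamma * V[T]            # Bellman backup on the joint-action Q-function

V_maximin = Q.min(axis=2).max(axis=1)   # max_a min_b Q(s, a, b)
V_minimax = Q.max(axis=1).min(axis=1)   # min_b max_a Q(s, a, b)
print("maximin values:", np.round(V_maximin, 3))
print("minimax values:", np.round(V_minimax, 3))
# The two coincide when each stage game admits a pure saddle point; Isaacs's
# condition plays the analogous role in the positional differential game.
```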
- Game-Theoretic Robust Reinforcement Learning Handles Temporally-Coupled Perturbations [98.5802673062712]
We introduce temporally-coupled perturbations, presenting a novel challenge for existing robust reinforcement learning methods.
We propose GRAD, a novel game-theoretic approach that treats the temporally-coupled robust RL problem as a partially observable two-player zero-sum game.
arXiv Detail & Related papers (2023-07-22T12:10:04Z)
- Light Unbalanced Optimal Transport [69.18220206873772]
Existing solvers are either based on heuristic principles or are heavy-weighted, with complex optimization objectives involving several neural networks.
We show that our solver provides a universal approximation of UEOT solutions and obtain its generalization bounds.
arXiv Detail & Related papers (2023-03-14T15:44:40Z)
- Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games [63.60117916422867]
This paper focuses on the most basic setting of competitive multi-agent RL, namely two-player zero-sum Markov games.
We propose a single-loop policy optimization method with symmetric updates from both agents, where the policy is updated via the entropy-regularized optimistic multiplicative weights update (OMWU) method (a bare entropy-regularized multiplicative-weights step on a matrix game is sketched after this entry).
Our convergence results improve upon the best known complexities, and lead to a better understanding of policy optimization in competitive Markov games.
arXiv Detail & Related papers (2022-10-03T16:05:43Z)
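A bare sketch of the multiplicative-weights part of that update on a single zero-sum matrix game follows; it is hedged, since the paper's full method adds an optimistic extrapolation step and runs on Markov games rather than a one-shot matrix game. The payoff matrix, step size, and temperature are illustrative.

```python
# Hedged sketch: entropy-regularized multiplicative weights on a zero-sum
# matrix game (the paper's OMWU method additionally uses an optimistic
# extrapolation step and symmetric Markov-game updates, omitted here).
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(3, 3))     # x maximizes x^T A y, y minimizes it
x = np.ones(3) / 3
y = np.ones(3) / 3
eta, tau = 0.1, 0.05            # step size and entropy temperature (illustrative)

def normalize(p):
    return p / p.sum()

for _ in range(2000):
    gx = A @ y                  # payoff gradient for the max player
    gy = -(x @ A)               # payoff gradient for the min player
    # The x ** (1 - eta * tau) factor implements the entropy regularization,
    # pulling both policies toward uniform (bounded rationality).
    x = normalize(x ** (1 - eta * tau) * np.exp(eta * gx))
    y = normalize(y ** (1 - eta * tau) * np.exp(eta * gy))

print("x:", np.round(x, 3))
print("y:", np.round(y, 3))
print("duality gap ~", float((A @ y).max() - (x @ A).min()))
```

The entropy term is what makes the regularized equilibrium unique and the last iterate well behaved, which is also the bounded-rationality interpretation used in the main paper.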
- Robust Reinforcement Learning as a Stackelberg Game via Adaptively-Regularized Adversarial Training [43.97565851415018]
Robust Reinforcement Learning (RL) focuses on improving performance under model errors or adversarial attacks.
Most of the existing literature models RARL as a zero-sum simultaneous game with Nash equilibrium as the solution concept.
We introduce a novel hierarchical formulation of robust RL - a general-sum Stackelberg game model called RRL-Stack.
arXiv Detail & Related papers (2022-02-19T03:44:05Z)
- Mind Your Solver! On Adversarial Attack and Defense for Combinatorial Optimization [111.78035414744045]
We take an initial step toward developing adversarial attack and defense mechanisms for combinatorial optimization solvers.
We present a simple yet effective defense strategy to modify the graph structure to increase the robustness of solvers.
arXiv Detail & Related papers (2021-12-28T15:10:15Z)
- Fast Policy Extragradient Methods for Competitive Games with Entropy Regularization [40.21627891283402]
This paper investigates the problem of computing the equilibrium of competitive games.
Motivated by the algorithmic role of entropy regularization, we develop provably efficient extragradient methods (the basic extragradient look-ahead step is sketched after this entry).
arXiv Detail & Related papers (2021-05-31T17:51:15Z)
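For intuition only, here is a bare-bones Euclidean extragradient step on an unconstrained bilinear saddle point; the paper's policy extragradient methods instead run entropy-regularized updates over probability simplices, so this sketch only shows the look-ahead structure. All quantities are invented.

```python
# Hedged sketch: plain Euclidean extragradient on min_x max_y x^T A y.
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(2, 2))
x = rng.normal(size=2)          # minimizing player
y = rng.normal(size=2)          # maximizing player
eta = 0.1

for _ in range(2000):
    # Half-step (extrapolation) using the current gradients.
    x_half = x - eta * (A @ y)
    y_half = y + eta * (A.T @ x)
    # Full step from the original point, using gradients at the half-step.
    x = x - eta * (A @ y_half)
    y = y + eta * (A.T @ x_half)

print("x:", np.round(x, 4), "y:", np.round(y, 4))
# Both iterates drift toward the saddle point at the origin; plain
# simultaneous gradient descent-ascent would instead spiral outward.
```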
- Combining Pessimism with Optimism for Robust and Efficient Model-Based Deep Reinforcement Learning [56.17667147101263]
In real-world tasks, reinforcement learning agents encounter situations that are not present during training time.
To ensure reliable performance, the RL agents need to exhibit robustness against worst-case situations.
We propose the Robust Hallucinated Upper-Confidence RL (RH-UCRL) algorithm to provably solve this problem.
arXiv Detail & Related papers (2021-03-18T16:50:17Z)
- Robust Reinforcement Learning using Adversarial Populations [118.73193330231163]
Reinforcement Learning (RL) is an effective tool for controller design but can struggle with issues of robustness.
We show that using a single adversary does not consistently yield robustness to dynamics variations under standard parametrizations of the adversary.
We propose a population-based augmentation to the Robust RL formulation in which we randomly initialize a population of adversaries and sample from the population uniformly during training (see the sketch after this entry).
arXiv Detail & Related papers (2020-08-04T20:57:32Z)
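A hedged skeleton of that sampling scheme is below; the policies, dynamics, and update step are stand-ins (the paper trains neural network policies with standard RL), so only the uniform sampling of an adversary per episode reflects the described method.

```python
# Hedged sketch of the adversarial-population idea (not the paper's code):
# randomly initialize several adversary policies and, for each training
# episode, sample one uniformly to perturb the protagonist's rollout.
import numpy as np

rng = np.random.default_rng(4)
obs_dim, act_dim, population_size = 4, 2, 5

# Stand-in policies: linear maps from observation to action/perturbation.
protagonist = rng.normal(scale=0.1, size=(act_dim, obs_dim))
adversaries = [rng.normal(scale=0.1, size=(act_dim, obs_dim))
               for _ in range(population_size)]

for episode in range(10):
    adversary = adversaries[rng.integers(population_size)]   # uniform sampling
    obs = rng.normal(size=obs_dim)
    episode_return = 0.0
    for t in range(20):                                      # toy rollout
        action = protagonist @ obs + adversary @ obs         # adversary adds a force
        obs = 0.9 * obs                                       # placeholder dynamics
        obs[:act_dim] += 0.1 * action
        episode_return += -float(obs @ obs)                   # stay near the origin
    # A real implementation would update the protagonist (and optionally the
    # sampled adversary) with an RL algorithm here; omitted in this sketch.
    print(f"episode {episode}: return {episode_return:.2f}")
```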