Beyond Worst-case Attacks: Robust RL with Adaptive Defense via
Non-dominated Policies
- URL: http://arxiv.org/abs/2402.12673v1
- Date: Tue, 20 Feb 2024 02:45:20 GMT
- Title: Beyond Worst-case Attacks: Robust RL with Adaptive Defense via
Non-dominated Policies
- Authors: Xiangyu Liu, Chenghao Deng, Yanchao Sun, Yongyuan Liang, Furong Huang
- Abstract summary: We study policy robustness under the well-accepted state-adversarial attack model.
We propose a novel training-time algorithm to iteratively discover non-dominated policies.
Empirical validation on the MuJoCo benchmark corroborates the superiority of our approach in terms of natural and robust performance.
- Score: 42.709038827974375
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In light of the burgeoning success of reinforcement learning (RL) in diverse
real-world applications, considerable focus has been directed towards ensuring
RL policies are robust to adversarial attacks during test time. Current
approaches largely revolve around solving a minimax problem to prepare for
potential worst-case scenarios. While effective against strong attacks, these
methods often compromise performance in the absence of attacks or the presence
of only weak attacks. To address this, we study policy robustness under the
well-accepted state-adversarial attack model, extending our focus beyond only
worst-case attacks. We first formalize this task at test time as a regret
minimization problem and establish its intrinsic hardness in achieving
sublinear regret when the baseline policy is from a general continuous policy
class, $\Pi$. This finding prompts us to \textit{refine} the baseline policy
class $\Pi$ prior to test time, aiming for efficient adaptation within a finite
policy class $\Tilde{\Pi}$, which can resort to an adversarial bandit
subroutine. In light of the importance of a small, finite $\Tilde{\Pi}$, we
propose a novel training-time algorithm to iteratively discover
\textit{non-dominated policies}, forming a near-optimal and minimal
$\Tilde{\Pi}$, thereby ensuring both robustness and test-time efficiency.
Empirical validation on the MuJoCo benchmark corroborates the superiority of our approach
in terms of natural and robust performance, as well as adaptability to various
attack scenarios.
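The test-time component described in the abstract can be pictured as an adversarial-bandit selector over the refined finite policy class $\Tilde{\Pi}$. The sketch below is not the paper's algorithm, only a minimal EXP3-style illustration under stated assumptions: a Gymnasium-style environment, pretrained policies exposed as callables, and a hypothetical `run_episode` helper; the learning rate and reward range are illustrative.

```python
import numpy as np

def run_episode(policy, env):
    """Roll out one episode and return its total reward (Gymnasium-style API assumed)."""
    obs, _ = env.reset()
    total, done = 0.0, False
    while not done:
        action = policy(obs)                      # policy: any callable obs -> action
        obs, reward, terminated, truncated, _ = env.step(action)
        total += reward
        done = terminated or truncated
    return total

def exp3_policy_selection(policies, env, num_episodes, eta=0.1, reward_range=(0.0, 1.0)):
    """Treat each policy in the finite class as a bandit arm and adapt with EXP3."""
    k = len(policies)
    log_weights = np.zeros(k)                     # log-domain weights for numerical stability
    lo, hi = reward_range
    history = []
    for _ in range(num_episodes):
        probs = np.exp(log_weights - log_weights.max())
        probs /= probs.sum()
        arm = np.random.choice(k, p=probs)        # sample a policy for this episode
        ret = run_episode(policies[arm], env)
        r = np.clip((ret - lo) / (hi - lo), 0.0, 1.0)   # normalize return to [0, 1]
        # Importance-weighted bandit feedback: only the played arm is updated.
        log_weights[arm] += eta * r / probs[arm]
        history.append((arm, ret))
    return history
```

In this picture, sublinear regret against the best fixed member of $\Tilde{\Pi}$ is what the bandit machinery provides, which is why a small, near-optimal $\Tilde{\Pi}$ matters for both robustness and adaptation speed.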
Related papers
- Belief-Enriched Pessimistic Q-Learning against Adversarial State
Perturbations [5.076419064097735]
Recent work shows that a well-trained RL agent can be easily manipulated by strategically perturbing its state observations at the test stage.
Existing solutions either introduce a regularization term to improve the smoothness of the trained policy against perturbations or alternatively train the agent's policy and the attacker's policy.
We propose a new robust RL algorithm for deriving a pessimistic policy to safeguard against an agent's uncertainty about true states.
arXiv Detail & Related papers (2024-03-06T20:52:49Z) - Implicit Poisoning Attacks in Two-Agent Reinforcement Learning:
Adversarial Policies for Training-Time Attacks [21.97069271045167]
In targeted poisoning attacks, an attacker manipulates the agent-environment interaction to force the agent into adopting a policy of interest, called the target policy.
We study targeted poisoning attacks in a two-agent setting where an attacker implicitly poisons the effective environment of one of the agents by modifying the policy of its peer.
We develop an optimization framework for designing optimal attacks, where the cost of the attack measures how much the solution deviates from the assumed default policy of the peer agent.
arXiv Detail & Related papers (2023-02-27T14:52:15Z) - Decentralized Optimistic Hyperpolicy Mirror Descent: Provably No-Regret
Learning in Markov Games [95.10091348976779]
We study decentralized policy learning in Markov games where we control a single agent to play with nonstationary and possibly adversarial opponents.
We propose a new algorithm, Decentralized Optimistic hypeRpolicy mIrror deScent (DORIS).
DORIS achieves $\sqrt{K}$-regret in the context of general function approximation, where $K$ is the number of episodes.
arXiv Detail & Related papers (2022-06-03T14:18:05Z) - Attacking and Defending Deep Reinforcement Learning Policies [3.6985039575807246]
We study the robustness of DRL policies to adversarial attacks from the perspective of robust optimization.
We propose a greedy attack algorithm, which tries to minimize the expected return of the policy without interacting with the environment, and a defense algorithm, which performs adversarial training in a max-min form (a generic sketch of this max-min scheme appears after this list).
arXiv Detail & Related papers (2022-05-16T12:47:54Z) - Practical Evaluation of Adversarial Robustness via Adaptive Auto Attack [96.50202709922698]
A practical evaluation method should be convenient (i.e., parameter-free), efficient (i.e., fewer iterations) and reliable.
We propose a parameter-free Adaptive Auto Attack (A$^3$) evaluation method which addresses efficiency and reliability in a test-time-training fashion.
arXiv Detail & Related papers (2022-03-10T04:53:54Z) - School of hard knocks: Curriculum analysis for Pommerman with a fixed
computational budget [4.726777092009554]
Pommerman is a hybrid cooperative/adversarial multi-agent environment.
This makes it a challenging environment for reinforcement learning approaches.
We develop a curriculum for learning a robust and promising policy within a constrained computational budget of 100,000 games.
arXiv Detail & Related papers (2021-02-23T15:43:09Z) - Robust Deep Reinforcement Learning through Adversarial Loss [74.20501663956604]
Recent studies have shown that deep reinforcement learning agents are vulnerable to small adversarial perturbations on the agent's inputs.
We propose RADIAL-RL, a principled framework to train reinforcement learning agents with improved robustness against adversarial attacks.
arXiv Detail & Related papers (2020-08-05T07:49:42Z) - DDPG++: Striving for Simplicity in Continuous-control Off-Policy
Reinforcement Learning [95.60782037764928]
We show that simple Deterministic Policy Gradient works remarkably well as long as the overestimation bias is controlled.
Second, we pinpoint training instabilities, typical of off-policy algorithms, to the greedy policy update step.
Third, we show that ideas in the propensity estimation literature can be used to importance-sample transitions from the replay buffer and update the policy to prevent performance deterioration.
arXiv Detail & Related papers (2020-06-26T20:21:12Z) - Robust Deep Reinforcement Learning against Adversarial Perturbations on
State Observations [88.94162416324505]
A deep reinforcement learning (DRL) agent observes its states through observations, which may contain natural measurement errors or adversarial noises.
Since the observations deviate from the true states, they can mislead the agent into making suboptimal actions.
We show that naively applying existing techniques on improving robustness for classification tasks, like adversarial training, is ineffective for many RL tasks.
arXiv Detail & Related papers (2020-03-19T17:59:59Z)
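Several of the entries above ("Attacking and Defending Deep Reinforcement Learning Policies" and the lines of work on state-observation perturbations) revolve around a max-min scheme: find a small observation perturbation that most degrades the policy, then train against it. The PyTorch sketch below is a generic PGD-style inner step under illustrative assumptions (an l_inf budget, a KL objective between clean and perturbed action distributions, and a `policy_net` mapping batched observations to action logits); it is not the method of any specific paper listed here.

```python
import torch
import torch.nn.functional as F

def worst_case_observation(policy_net, obs, epsilon=0.05, steps=10, step_size=0.01):
    """PGD-style search for an l_inf-bounded perturbation that most shifts the policy's action distribution."""
    clean_logits = policy_net(obs).detach()
    delta = torch.zeros_like(obs, requires_grad=True)
    for _ in range(steps):
        perturbed_logits = policy_net(obs + delta)
        # Ascend on the KL divergence between the clean and perturbed action distributions.
        loss = F.kl_div(
            F.log_softmax(perturbed_logits, dim=-1),
            F.softmax(clean_logits, dim=-1),
            reduction="batchmean",
        )
        loss.backward()
        with torch.no_grad():
            delta += step_size * delta.grad.sign()
            delta.clamp_(-epsilon, epsilon)
        delta.grad.zero_()
    return (obs + delta).detach()

# Max-min training then alternates: compute obs_adv = worst_case_observation(policy_net, obs)
# and apply the usual policy-gradient or Q-learning update on obs_adv instead of obs.
```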