Pruning Cannot Hurt Robustness: Certified Trade-offs in Reinforcement Learning
- URL: http://arxiv.org/abs/2510.12939v1
- Date: Tue, 14 Oct 2025 19:35:27 GMT
- Title: Pruning Cannot Hurt Robustness: Certified Trade-offs in Reinforcement Learning
- Authors: James Pedley, Benjamin Etheridge, Stephen J. Roberts, Francesco Quinzan
- Abstract summary: We develop the first theoretical framework for certified robustness under pruning in state-adversarial Markov decision processes. We derive a novel three-term regret decomposition that disentangles clean-task performance, pruning-induced performance loss, and robustness gains.
- Score: 6.883578421923203
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning (RL) policies deployed in real-world environments must remain reliable under adversarial perturbations. At the same time, modern deep RL agents are heavily over-parameterized, raising costs and fragility concerns. While pruning has been shown to improve robustness in supervised learning, its role in adversarial RL remains poorly understood. We develop the first theoretical framework for certified robustness under pruning in state-adversarial Markov decision processes (SA-MDPs). For Gaussian and categorical policies with Lipschitz networks, we prove that element-wise pruning can only tighten certified robustness bounds; pruning never makes the policy less robust. Building on this, we derive a novel three-term regret decomposition that disentangles clean-task performance, pruning-induced performance loss, and robustness gains, exposing a fundamental performance-robustness frontier. Empirically, we evaluate magnitude and micro-pruning schedules on continuous-control benchmarks with strong policy-aware adversaries. Across tasks, pruning consistently uncovers reproducible "sweet spots" at moderate sparsity levels, where robustness improves substantially without harming - and sometimes even enhancing - clean performance. These results position pruning not merely as a compression tool but as a structural intervention for robust RL.
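The abstract's core argument combines a Lipschitz bound on the policy network with element-wise pruning. The paper's actual certificates and SA-MDP machinery are not reproduced here; the sketch below is only an illustrative NumPy toy under stated assumptions. It uses the per-layer l_inf operator norm (maximum absolute row sum), one choice of layer bound that is monotone under zeroing weights, so the resulting certificate can only tighten as sparsity grows. The layer sizes, perturbation budget eps, and sparsity levels are arbitrary illustrative values, not the paper's settings.

```python
# Minimal sketch (not the paper's implementation): element-wise magnitude
# pruning of a small ReLU policy network, with a crude certified bound on how
# far the action mean can move under an l_inf-bounded state perturbation.
# The bound multiplies per-layer l_inf operator norms (max absolute row sum),
# which can only shrink when weights are zeroed -- illustrating why pruning
# cannot loosen this particular Lipschitz-style certificate.
import numpy as np

rng = np.random.default_rng(0)

def make_policy(sizes):
    """Random weight matrices for a ReLU MLP mapping state -> action mean."""
    return [rng.normal(scale=0.1, size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]

def magnitude_prune(weights, sparsity):
    """Zero the smallest-magnitude entries of every layer (element-wise pruning)."""
    pruned = []
    for W in weights:
        k = int(sparsity * W.size)
        thresh = np.sort(np.abs(W), axis=None)[k] if k > 0 else 0.0
        pruned.append(np.where(np.abs(W) >= thresh, W, 0.0))
    return pruned

def lipschitz_upper_bound(weights):
    """Product of l_inf operator norms: an upper bound on the network's
    l_inf -> l_inf Lipschitz constant (ReLU activations are 1-Lipschitz)."""
    return float(np.prod([np.abs(W).sum(axis=1).max() for W in weights]))

def certified_action_shift(weights, eps):
    """Worst-case l_inf shift of the action mean when the state perturbation
    satisfies ||delta||_inf <= eps."""
    return lipschitz_upper_bound(weights) * eps

weights = make_policy([8, 64, 64, 2])   # 8-dim state, 2-dim action mean (toy sizes)
eps = 0.05                              # adversarial budget on the state (toy value)
for sparsity in [0.0, 0.3, 0.5, 0.8]:
    bound = certified_action_shift(magnitude_prune(weights, sparsity), eps)
    print(f"sparsity {sparsity:.1f}: certified action shift <= {bound:.4f}")
```

Running the loop prints a bound that is non-increasing in sparsity, mirroring the qualitative claim that pruning can only tighten this style of certificate; it says nothing about clean-task performance, which is where the paper's three-term regret decomposition comes in.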
Related papers
- When Sensors Fail: Temporal Sequence Models for Robust PPO under Sensor Drift [64.37959940809633]
We study the robustness of Proximal Policy Optimization (PPO) under temporally persistent sensor failures. We show that Transformer-based sequence policies substantially outperform RNNs and SSMs in robustness, maintaining high returns even when large fractions of sensors are unavailable.
arXiv Detail & Related papers (2026-03-04T22:21:54Z) - Formal Synthesis of Certifiably Robust Neural Lyapunov-Barrier Certificates [9.62123513414546]
We study the problem of synthesizing robust neural Lyapunov-barrier certificates that maintain their guarantees under perturbations in system dynamics. We propose practical training objectives that enforce these conditions via adversarial training, Lipschitz neighborhood bounds, and global Lipschitz regularization. Our results demonstrate the effectiveness of training robust neural certificates for safe RL under perturbations in dynamics.
arXiv Detail & Related papers (2026-02-05T05:08:01Z) - Distributionally Robust Self Paced Curriculum Reinforcement Learning [42.51809641161819]
We propose Distributionally Robust Self-Paced Curriculum Reinforcement Learning (DR-SPCRL). DR-SPCRL adaptively schedules the robustness budget according to the agent's progress, enabling a balance between nominal and robust performance. Empirical results across multiple environments demonstrate that DR-SPCRL not only stabilizes training but also achieves a superior robustness-performance trade-off.
arXiv Detail & Related papers (2025-11-07T20:25:43Z) - Robust Multi-Agent Reinforcement Learning via Adversarial Regularization: Theoretical Foundation and Stable Algorithms [79.61176746380718]
Multi-Agent Reinforcement Learning (MARL) has shown promising results across several domains.
MARL policies often lack robustness and are sensitive to small changes in their environment.
We show that we can gain robustness by controlling a policy's Lipschitz constant.
We propose a new robust MARL framework, ERNIE, that promotes the Lipschitz continuity of the policies.
arXiv Detail & Related papers (2023-10-16T20:14:06Z) - Robust Reinforcement Learning in Continuous Control Tasks with Uncertainty Set Regularization [17.322284328945194]
Reinforcement learning (RL) is recognized as lacking generalization and robustness under environmental perturbations.
We propose a new regularizer named $\textbf{U}$ncertainty $\textbf{S}$et $\textbf{R}$egularizer (USR).
arXiv Detail & Related papers (2022-07-05T12:56:08Z) - COPA: Certifying Robust Policies for Offline Reinforcement Learning against Poisoning Attacks [49.15885037760725]
We focus on certifying the robustness of offline reinforcement learning (RL) in the presence of poisoning attacks.
We propose the first certification framework, COPA, to certify the number of poisoning trajectories that can be tolerated.
We prove that some of the proposed certification methods are theoretically tight, while others correspond to NP-Complete problems.
arXiv Detail & Related papers (2022-03-16T05:02:47Z) - Policy Smoothing for Provably Robust Reinforcement Learning [109.90239627115336]
We study the provable robustness of reinforcement learning against norm-bounded adversarial perturbations of the inputs.
We generate certificates that guarantee that the total reward obtained by the smoothed policy will not fall below a certain threshold under a norm-bounded adversarial perturbation of the input (a minimal sketch of the smoothing step appears after this list).
arXiv Detail & Related papers (2021-06-21T21:42:08Z) - CROP: Certifying Robust Policies for Reinforcement Learning through Functional Smoothing [41.093241772796475]
We present the first framework of Certifying Robust Policies for reinforcement learning (CROP) against adversarial state perturbations.
We propose two types of robustness certification criteria: robustness of per-state actions and lower bound of cumulative rewards.
arXiv Detail & Related papers (2021-06-17T07:58:32Z) - Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning [63.53407136812255]
Offline Reinforcement Learning promises to learn effective policies from previously-collected, static datasets without the need for exploration.
Existing Q-learning and actor-critic based off-policy RL algorithms fail when bootstrapping from out-of-distribution (OOD) actions or states.
We propose Uncertainty Weighted Actor-Critic (UWAC), an algorithm that detects OOD state-action pairs and down-weights their contribution in the training objectives accordingly.
arXiv Detail & Related papers (2021-05-17T20:16:46Z) - Robust Deep Reinforcement Learning through Adversarial Loss [74.20501663956604]
Recent studies have shown that deep reinforcement learning agents are vulnerable to small adversarial perturbations on the agent's inputs.
We propose RADIAL-RL, a principled framework to train reinforcement learning agents with improved robustness against adversarial attacks.
arXiv Detail & Related papers (2020-08-05T07:49:42Z) - Robust Deep Reinforcement Learning against Adversarial Perturbations on State Observations [88.94162416324505]
A deep reinforcement learning (DRL) agent observes its states through observations, which may contain natural measurement errors or adversarial noises.
Since the observations deviate from the true states, they can mislead the agent into making suboptimal actions.
We show that naively applying existing techniques on improving robustness for classification tasks, like adversarial training, is ineffective for many RL tasks.
arXiv Detail & Related papers (2020-03-19T17:59:59Z)
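The Policy Smoothing and CROP entries above both certify RL policies by evaluating them on randomized, noise-smoothed observations. The sketch below shows only that smoothing step for a discrete-action policy, as a majority vote over Gaussian-noised copies of the observation; it is a hedged toy, not those papers' code. The base_policy function, noise scale sigma, and sample count are placeholder assumptions, and the certification analysis those papers derive on top of the smoothed policy is omitted.

```python
# Minimal sketch (assumptions, not the cited papers' code): act with the base
# policy on Gaussian-noised copies of the observation and take a majority vote.
# This is the smoothing step only; the reward/action certificates in the
# Policy Smoothing and CROP papers require additional analysis.
import numpy as np

def smoothed_discrete_action(base_policy, obs, sigma=0.2, n_samples=100, seed=0):
    """Majority vote of the base policy's discrete actions over noisy observations."""
    rng = np.random.default_rng(seed)
    votes = {}
    for _ in range(n_samples):
        a = base_policy(obs + rng.normal(scale=sigma, size=obs.shape))
        votes[a] = votes.get(a, 0) + 1
    return max(votes, key=votes.get)

# Toy base policy: pick action 1 if the first observation coordinate is positive.
toy_policy = lambda o: int(o[0] > 0.0)
print(smoothed_discrete_action(toy_policy, np.array([0.3, -1.2])))
```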