Adversarial Robustness in Two-Stage Learning-to-Defer: Algorithms and Guarantees
- URL: http://arxiv.org/abs/2502.01027v1
- Date: Mon, 03 Feb 2025 03:44:35 GMT
- Title: Adversarial Robustness in Two-Stage Learning-to-Defer: Algorithms and Guarantees
- Authors: Yannis Montreuil, Axel Carlier, Lai Xing Ng, Wei Tsang Ooi,
- Abstract summary: Learning-to-Defer (L2D) facilitates optimal task allocation between AI systems and decision-makers.
This paper conducts the first comprehensive analysis of adversarial robustness in two-stage L2D frameworks.
We propose SARD, a robust, convex, deferral algorithm rooted in Bayes and $(mathcalR,mathcalG)$-consistency.
- Score: 3.6787328174619254
- License:
- Abstract: Learning-to-Defer (L2D) facilitates optimal task allocation between AI systems and decision-makers. Despite its potential, we show that current two-stage L2D frameworks are highly vulnerable to adversarial attacks, which can misdirect queries or overwhelm decision agents, significantly degrading system performance. This paper conducts the first comprehensive analysis of adversarial robustness in two-stage L2D frameworks. We introduce two novel attack strategies -- untargeted and targeted -- that exploit inherent structural vulnerabilities in these systems. To mitigate these threats, we propose SARD, a robust, convex, deferral algorithm rooted in Bayes and $(\mathcal{R},\mathcal{G})$-consistency. Our approach guarantees optimal task allocation under adversarial perturbations for all surrogates in the cross-entropy family. Extensive experiments on classification, regression, and multi-task benchmarks validate the robustness of SARD.
Related papers
- A Two-Stage Learning-to-Defer Approach for Multi-Task Learning [3.4289478404209826]
We introduce a novel Two-Stage Learning-to-Defer framework for multi-task learning that jointly addresses classification and regression tasks.
We validate our framework on two challenging tasks: object detection, where classification and regression are tightly coupled, and electronic health record analysis.
arXiv Detail & Related papers (2024-10-21T07:44:57Z) - Efficient Adversarial Training in LLMs with Continuous Attacks [99.5882845458567]
Large language models (LLMs) are vulnerable to adversarial attacks that can bypass their safety guardrails.
We propose a fast adversarial training algorithm (C-AdvUL) composed of two losses.
C-AdvIPO is an adversarial variant of IPO that does not require utility data for adversarially robust alignment.
arXiv Detail & Related papers (2024-05-24T14:20:09Z) - Enhancing Infrared Small Target Detection Robustness with Bi-Level
Adversarial Framework [61.34862133870934]
We propose a bi-level adversarial framework to promote the robustness of detection in the presence of distinct corruptions.
Our scheme remarkably improves 21.96% IOU across a wide array of corruptions and notably promotes 4.97% IOU on the general benchmark.
arXiv Detail & Related papers (2023-09-03T06:35:07Z) - Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly-robust instance reweighted adversarial framework.
Our importance weights are obtained by optimizing the KL-divergence regularized loss function.
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
arXiv Detail & Related papers (2023-08-01T06:16:18Z) - Towards Adversarial Realism and Robust Learning for IoT Intrusion
Detection and Classification [0.0]
The Internet of Things (IoT) faces tremendous security challenges.
The increasing threat posed by adversarial attacks restates the need for reliable defense strategies.
This work describes the types of constraints required for an adversarial cyber-attack example to be realistic.
arXiv Detail & Related papers (2023-01-30T18:00:28Z) - Resisting Adversarial Attacks in Deep Neural Networks using Diverse
Decision Boundaries [12.312877365123267]
Deep learning systems are vulnerable to crafted adversarial examples, which may be imperceptible to the human eye, but can lead the model to misclassify.
We develop a new ensemble-based solution that constructs defender models with diverse decision boundaries with respect to the original model.
We present extensive experimentations using standard image classification datasets, namely MNIST, CIFAR-10 and CIFAR-100 against state-of-the-art adversarial attacks.
arXiv Detail & Related papers (2022-08-18T08:19:26Z) - Semi-supervised Domain Adaptive Structure Learning [72.01544419893628]
Semi-supervised domain adaptation (SSDA) is a challenging problem requiring methods to overcome both 1) overfitting towards poorly annotated data and 2) distribution shift across domains.
We introduce an adaptive structure learning method to regularize the cooperation of SSL and DA.
arXiv Detail & Related papers (2021-12-12T06:11:16Z) - A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNNs) based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against the textbfunseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z) - Hidden Cost of Randomized Smoothing [72.93630656906599]
In this paper, we point out the side effects of current randomized smoothing.
Specifically, we articulate and prove two major points: 1) the decision boundaries of smoothed classifiers will shrink, resulting in disparity in class-wise accuracy; 2) applying noise augmentation in the training process does not necessarily resolve the shrinking issue due to the inconsistent learning objectives.
arXiv Detail & Related papers (2020-03-02T23:37:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.