Scaling Adversarial Training via Data Selection
- URL: http://arxiv.org/abs/2512.22069v1
- Date: Fri, 26 Dec 2025 15:50:33 GMT
- Title: Scaling Adversarial Training via Data Selection
- Authors: Youran Ye, Dejin Wang, Ajinkya Bhandare
- Abstract summary: We propose Selective Adversarial Training, which perturbs only a subset of critical samples in each minibatch. Experiments on MNIST and CIFAR-10 show that the proposed methods achieve robustness comparable to, or even exceeding, full PGD adversarial training.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Projected Gradient Descent (PGD) is a strong and widely used first-order adversarial attack, yet its computational cost scales poorly, as all training samples undergo identical iterative inner-loop optimization despite contributing unequally to robustness. Motivated by this inefficiency, we propose Selective Adversarial Training, which perturbs only a subset of critical samples in each minibatch. Specifically, we introduce two principled selection criteria: (1) margin-based sampling, which prioritizes samples near the decision boundary, and (2) gradient-matching sampling, which selects samples whose gradients align with the dominant batch optimization direction. Adversarial examples are generated only for the selected subset, while the remaining samples are trained cleanly using a mixed objective. Experiments on MNIST and CIFAR-10 show that the proposed methods achieve robustness comparable to, or even exceeding, full PGD adversarial training, while reducing adversarial computation by up to 50%, demonstrating that informed sample selection is sufficient for scalable adversarial robustness.
Related papers
- Not All Candidates are Created Equal: A Heterogeneity-Aware Approach to Pre-ranking in Recommender Systems [11.849498011182066]
Heterogeneity-Aware Adaptive Pre-ranking (HAP) is a unified framework that mitigates gradient conflicts through conflict-sensitive sampling. HAP has been deployed in the Toutiao production system for 9 months, yielding up to 0.4% improvement in user app usage duration.
arXiv Detail & Related papers (2026-03-04T06:27:47Z)
- Winning the Pruning Gamble: A Unified Approach to Joint Sample and Token Pruning for Efficient Supervised Fine-Tuning [71.30276778807068]
We propose a unified framework that strategically coordinates sample pruning and token pruning. Q-Tuning achieves a +38% average improvement over the full-data SFT baseline using only 12.5% of the original training data.
arXiv Detail & Related papers (2025-09-28T13:27:38Z)
- Enhancing Sample Selection Against Label Noise by Cutting Mislabeled Easy Examples [74.60723854735237]
We show that mislabeled examples correctly predicted by the model early in the training process are particularly harmful to model performance. We propose Early Cutting, which employs the model's later training state to re-select the confident subset identified early in training.
arXiv Detail & Related papers (2025-02-12T09:12:45Z)
- ANNE: Adaptive Nearest Neighbors and Eigenvector-based Sample Selection for Robust Learning with Noisy Labels [7.897299759691143]
This paper introduces the Adaptive Nearest Neighbors and Eigenvector-based (ANNE) sample selection methodology.
ANNE integrates loss-based sampling with the feature-based sampling methods FINE and Adaptive KNN to optimize performance across a wide range of noise rate scenarios.
arXiv Detail & Related papers (2024-11-03T15:51:38Z)
- Batch-in-Batch: a new adversarial training framework for initial perturbation and sample selection [9.241737058291823]
Adversarial training methods generate independent initial perturbations for adversarial samples from a simple uniform distribution.
We propose a simple yet effective training framework called Batch-in-Batch to enhance models.
We show that models trained within the BB framework consistently have higher adversarial accuracy across various adversarial settings.
arXiv Detail & Related papers (2024-06-06T13:34:43Z)
- Understanding and Mitigating the Bias in Sample Selection for Learning with Noisy Labels [22.24077757409148]
We propose a noIse-Tolerant Expert Model (ITEM) for debiased learning in sample selection. Specifically, to mitigate the training bias, we design a robust network architecture that integrates with multiple experts. By training on the mixture of two class-discriminative mini-batches, the model mitigates the effect of the imbalanced training set.
arXiv Detail & Related papers (2024-01-24T10:37:28Z)
- Data Pruning via Moving-one-Sample-out [61.45441981346064]
We propose a novel data-pruning approach called moving-one-sample-out (MoSo).
MoSo aims to identify and remove the least informative samples from the training set.
Experimental results demonstrate that MoSo effectively mitigates severe performance degradation at high pruning ratios.
arXiv Detail & Related papers (2023-10-23T08:00:03Z)
- Selecting Learnable Training Samples is All DETRs Need in Crowded Pedestrian Detection [72.97320260601347]
In crowded pedestrian detection, the performance of DETRs is still unsatisfactory due to the inappropriate sample selection method.
We propose Sample Selection for Crowded Pedestrians, which consists of the constraint-guided label assignment scheme (CGLA)
Experimental results show that the proposed SSCP effectively improves the baselines without introducing any overhead in inference.
arXiv Detail & Related papers (2023-05-18T08:28:01Z)
- Delving into Identify-Emphasize Paradigm for Combating Unknown Bias [52.76758938921129]
We propose an effective bias-conflicting scoring method (ECS) to boost the identification accuracy.
We also propose gradient alignment (GA) to balance the contributions of the mined bias-aligned and bias-conflicting samples.
Experiments are conducted on multiple datasets in various settings, demonstrating that the proposed solution can mitigate the impact of unknown biases.
arXiv Detail & Related papers (2023-02-22T14:50:24Z)
- Jo-SRC: A Contrastive Approach for Combating Noisy Labels [58.867237220886885]
We propose a noise-robust approach named Jo-SRC (Joint Sample Selection and Model Regularization based on Consistency).
Specifically, we train the network in a contrastive learning manner. Predictions from two different views of each sample are used to estimate its "likelihood" of being clean or out-of-distribution.
arXiv Detail & Related papers (2021-03-24T07:26:07Z)