Data Profiling for Adversarial Training: On the Ruin of Problematic Data
- URL: http://arxiv.org/abs/2102.07437v1
- Date: Mon, 15 Feb 2021 10:17:24 GMT
- Title: Data Profiling for Adversarial Training: On the Ruin of Problematic Data
- Authors: Chengyu Dong, Liyuan Liu, Jingbo Shang
- Abstract summary: Problems in adversarial training include robustness-accuracy trade-off, robust overfitting, and gradient masking.
We show that these problems share one common cause -- low quality samples in the dataset.
We find that when problematic data is removed, robust overfitting and gradient masking can be largely alleviated.
- Score: 27.11328449349065
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multiple intriguing problems hover in adversarial training, including
robustness-accuracy trade-off, robust overfitting, and gradient masking, posing
great challenges to both reliable evaluation and practical deployment. Here, we
show that these problems share one common cause -- low quality samples in the
dataset. We first identify an intrinsic property of the data called problematic
score and then design controlled experiments to investigate its connections
with these problems. Specifically, we find that when problematic data is
removed, robust overfitting and gradient masking can be largely alleviated; and
robustness-accuracy trade-off is more prominent for a dataset containing highly
problematic data. These observations not only verify our intuition about data
quality but also open new opportunities to advance adversarial training.
Remarkably, simply removing problematic data from adversarial training, while
making the training set smaller, yields better robustness consistently with
different adversary settings, training methods, and neural architectures.
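A minimal sketch of the recipe the abstract describes: score each training sample, drop the most problematic ones, and adversarially train on the remainder. The margin-based `problematic_score` below is a stand-in heuristic, not the paper's actual definition, and the reference logits are assumed to come from a separately trained model:

```python
def problematic_score(logits, label):
    """Stand-in heuristic: a sample whose true-class logit barely beats
    (or loses to) the best rival class gets a high problematic score."""
    rival = max(l for i, l in enumerate(logits) if i != label)
    return rival - logits[label]  # high => hard/ambiguous sample

def filter_dataset(samples, logits_per_sample, drop_frac=0.1):
    """Drop the drop_frac most problematic samples before adversarial training.

    samples: list of (x, y) pairs; logits_per_sample: per-sample logit vectors
    from a reference model (an assumption, not the paper's procedure).
    """
    scored = sorted(
        zip(samples, logits_per_sample),
        key=lambda sz: problematic_score(sz[1], sz[0][1]),
    )
    keep = int(len(samples) * (1 - drop_frac))
    return [s for s, _ in scored[:keep]]

# Toy example: 10 samples with 3-class reference logits
samples = [([0.0] * 5, i % 3) for i in range(10)]
logits = [[1.0, 0.5, 0.0] for _ in range(10)]
kept = filter_dataset(samples, logits, drop_frac=0.2)
print(len(kept))  # 8
```

The filtered set would then be fed to an unchanged adversarial training loop, which is exactly the point of the abstract's claim: the gain comes from the data, not from a new training algorithm.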
Related papers
- DataFreeShield: Defending Adversarial Attacks without Training Data [32.29186953320468]
We investigate the problem of data-free adversarial robustness, where we try to achieve robustness without accessing real data.
We propose DataFreeShield, which tackles the problem from two perspectives: surrogate dataset generation and adversarial training.
We show that DataFreeShield outperforms baselines, demonstrating that it is the first entirely data-free solution to the adversarial robustness problem.
arXiv Detail & Related papers (2024-06-21T20:24:03Z)
- Can We Enhance the Quality of Mobile Crowdsensing Data Without Ground Truth? [45.875832406278214]
Mobile crowdsensing (MCS) has emerged as a prominent trend across various domains.
This article proposes a prediction- and reputation-based truth discovery framework.
It can separate low-quality data from high-quality data in sensing tasks.
arXiv Detail & Related papers (2024-05-29T03:16:12Z)
- Corrective Machine Unlearning [22.342035149807923]
We formalize Corrective Machine Unlearning as the problem of mitigating the impact of data affected by unknown manipulations on a trained model.
We find most existing unlearning methods, including retraining-from-scratch without the deletion set, require most of the manipulated data to be identified for effective corrective unlearning.
One approach, Selective Synaptic Dampening, achieves limited success, unlearning adverse effects with just a small portion of the manipulated samples in our setting.
arXiv Detail & Related papers (2024-02-21T18:54:37Z)
- Building Manufacturing Deep Learning Models with Minimal and Imbalanced Training Data Using Domain Adaptation and Data Augmentation [15.333573151694576]
We propose a novel domain adaptation (DA) approach to address the problem of labeled training data scarcity for a target learning task.
Our approach works for scenarios where the source dataset and the dataset available for the target learning task have the same or different feature spaces.
We evaluate our combined approach using image data for wafer defect prediction.
arXiv Detail & Related papers (2023-05-31T21:45:34Z)
- Towards Robust Dataset Learning [90.2590325441068]
We propose a principled, tri-level optimization to formulate the robust dataset learning problem.
Under an abstraction model that characterizes robust vs. non-robust features, the proposed method provably learns a robust dataset.
arXiv Detail & Related papers (2022-11-19T17:06:10Z)
- Understanding Robust Overfitting of Adversarial Training and Beyond [103.37117541210348]
We show that robust overfitting widely exists in adversarial training of deep networks.
We propose minimum loss constrained adversarial training (MLCAT).
In a minibatch, we learn large-loss data as usual, and adopt additional measures to increase the loss of the small-loss data.
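The minibatch logic described above can be sketched as follows; the loss-flooring form of the "additional measures" is an assumption for illustration, not necessarily the authors' exact scheme:

```python
def mlcat_batch_loss(sample_losses, floor):
    """Sketch of the MLCAT idea: large-loss samples contribute their loss
    as usual, while small-loss samples have their loss term sign-reversed
    (a flooring trick), so gradient descent *raises* their loss instead of
    memorizing them."""
    total = 0.0
    for loss in sample_losses:
        if loss >= floor:
            total += loss              # large-loss data: learn as usual
        else:
            total += 2 * floor - loss  # small-loss data: reversed sign raises loss
    return total / len(sample_losses)

# Toy minibatch: two easy samples and one hard one, floor at 1.0
print(mlcat_batch_loss([0.1, 0.5, 2.0], floor=1.0))
```

The reversed-sign branch keeps the surrogate loss positive while flipping the gradient direction for samples below the floor, which matches the stated goal of increasing the loss of small-loss data.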
arXiv Detail & Related papers (2022-06-17T10:25:17Z)
- On Covariate Shift of Latent Confounders in Imitation and Reinforcement Learning [69.48387059607387]
We consider the problem of using expert data with unobserved confounders for imitation and reinforcement learning.
We analyze the limitations of learning from confounded expert data with and without external reward.
We validate our claims empirically on challenging assistive healthcare and recommender system simulation tasks.
arXiv Detail & Related papers (2021-10-13T07:31:31Z)
- Improving filling level classification with adversarial training [90.01594595780928]
We investigate the problem of classifying - from a single image - the level of content in a cup or a drinking glass.
We use adversarial training in a generic source dataset and then refine the training with a task-specific dataset.
We show that transfer learning with adversarial training in the source domain consistently improves the classification accuracy on the test set.
arXiv Detail & Related papers (2021-02-08T08:32:56Z)
- Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z)
- Precise Tradeoffs in Adversarial Training for Linear Regression [55.764306209771405]
We provide a precise and comprehensive understanding of the role of adversarial training in the context of linear regression with Gaussian features.
We precisely characterize the standard/robust accuracy and the corresponding tradeoff achieved by a contemporary mini-max adversarial training approach.
Our theory for adversarial training algorithms also facilitates the rigorous study of how a variety of factors (size and quality of training data, model overparametrization etc.) affect the tradeoff between these two competing accuracies.
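For intuition on such tradeoff analyses: under an l2-bounded input perturbation, the worst-case squared loss of a linear model has a standard closed form, max over ||d||2 <= eps of (y - w.(x+d))^2 = (|y - w.x| + eps*||w||2)^2. This identity is a general fact about linear regression, used here only as background (the paper's Gaussian-features setup is more specific), and it can be checked numerically:

```python
import math
import random

def adv_sq_loss(w, x, y, eps):
    """Closed-form worst-case squared loss for a linear model under an
    l2-bounded input perturbation of radius eps."""
    resid = abs(y - sum(wi * xi for wi, xi in zip(w, x)))
    return (resid + eps * math.sqrt(sum(wi * wi for wi in w))) ** 2

# Numerical check: no random perturbation on the eps-sphere exceeds the bound
random.seed(0)
w, x, y, eps = [1.0, -2.0], [0.5, 1.5], 3.0, 0.1
best = 0.0
for _ in range(2000):
    d = [random.gauss(0.0, 1.0) for _ in w]
    norm = math.sqrt(sum(di * di for di in d))
    d = [eps * di / norm for di in d]
    pred = sum(wi * (xi + di) for wi, xi, di in zip(w, x, d))
    best = max(best, (y - pred) ** 2)
print(best <= adv_sq_loss(w, x, y, eps) + 1e-9)  # True
```

The eps*||w||2 term is what couples robustness to the weight norm, which is one concrete way the size and quality of the training data can shift the standard/robust accuracy tradeoff.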
arXiv Detail & Related papers (2020-02-24T19:01:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.