Masked Images Are Counterfactual Samples for Robust Fine-tuning
- URL: http://arxiv.org/abs/2303.03052v3
- Date: Sun, 2 Apr 2023 13:33:20 GMT
- Title: Masked Images Are Counterfactual Samples for Robust Fine-tuning
- Authors: Yao Xiao, Ziyi Tang, Pengxu Wei, Cong Liu, Liang Lin
- Abstract summary: Fine-tuning deep learning models can lead to a trade-off between in-distribution (ID) performance and out-of-distribution (OOD) robustness.
We propose a novel fine-tuning method that uses masked images as counterfactual samples to improve the robustness of the fine-tuned model.
- Score: 77.82348472169335
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning models are challenged by the distribution shift between the
training data and test data. Recently, the large models pre-trained on diverse
data have demonstrated unprecedented robustness to various distribution shifts.
However, fine-tuning these models can lead to a trade-off between
in-distribution (ID) performance and out-of-distribution (OOD) robustness.
Existing methods for tackling this trade-off do not explicitly address the OOD
robustness problem. In this paper, based on causal analysis of the
aforementioned problems, we propose a novel fine-tuning method, which uses
masked images as counterfactual samples that help improve the robustness of the
fine-tuned model. Specifically, we mask either the semantics-related or
semantics-unrelated patches of the images based on class activation map to
break the spurious correlation, and refill the masked patches with patches from
other images. The resulting counterfactual samples are used in feature-based
distillation with the pre-trained model. Extensive experiments verify that
regularizing the fine-tuning with the proposed masked images can achieve a
better trade-off between ID and OOD performance, surpassing previous methods on
the OOD performance. Our code is available at
https://github.com/Coxy7/robust-finetuning.
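The pipeline described in the abstract (CAM-guided patch masking, refilling from other images, and feature-based distillation) can be sketched roughly as follows. This is a minimal NumPy sketch, not the paper's implementation: the patch size, masking ratio, and helper names are illustrative assumptions, and the class activation map and feature extractors are taken as given inputs.

```python
import numpy as np

def patchify(img, p):
    # (H, W, C) -> (H//p, W//p, p, p, C) grid of non-overlapping patches
    H, W, C = img.shape
    return img.reshape(H // p, p, W // p, p, C).swapaxes(1, 2)

def unpatchify(patches):
    gh, gw, p, _, C = patches.shape
    return patches.swapaxes(1, 2).reshape(gh * p, gw * p, C)

def counterfactual_mask(img, other, cam, p, mask_semantic=True, ratio=0.5):
    """Mask the patches of `img` ranked highest (or lowest) by the class
    activation map `cam`, refilling them with patches from `other`.
    `mask_semantic=True` masks the semantics-related (high-CAM) patches."""
    patches = patchify(img, p).copy()
    fill = patchify(other, p)
    gh, gw = patches.shape[:2]
    # average CAM score per patch
    cam_patches = patchify(cam[..., None], p).mean(axis=(2, 3, 4))
    order = np.argsort(cam_patches.ravel())  # ascending CAM score
    k = int(ratio * gh * gw)
    idx = order[-k:] if mask_semantic else order[:k]
    for i in idx:
        r, c = divmod(i, gw)
        patches[r, c] = fill[r, c]
    return unpatchify(patches)

def distill_loss(feat_student, feat_teacher):
    """Feature-based distillation: pull the fine-tuned model's features on
    the counterfactual image toward the frozen pre-trained model's features
    (a simple MSE stand-in for the paper's distillation objective)."""
    return float(np.mean((np.asarray(feat_student) - np.asarray(feat_teacher)) ** 2))
```

In training, the counterfactual image would be fed through both the fine-tuned and the frozen pre-trained backbone, and `distill_loss` added as a regularizer to the task loss.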
Related papers
- Model Integrity when Unlearning with T2I Diffusion Models [11.321968363411145]
We propose approximate Machine Unlearning algorithms to reduce the generation of specific types of images, characterized by samples from a "forget distribution".
We then propose unlearning algorithms that demonstrate superior effectiveness in preserving model integrity compared to existing baselines.
arXiv Detail & Related papers (2024-11-04T13:15:28Z)
- Can Your Generative Model Detect Out-of-Distribution Covariate Shift? [2.0144831048903566]
We propose a novel method for detecting Out-of-Distribution (OOD) sensory data using conditional Normalizing Flows (cNFs)
Our results on CIFAR10 vs. CIFAR10-C and ImageNet200 vs. ImageNet200-C demonstrate the effectiveness of the method.
arXiv Detail & Related papers (2024-09-04T19:27:56Z)
- Adversarial Robustification via Text-to-Image Diffusion Models [56.37291240867549]
Adversarial robustness has been conventionally believed to be a challenging property to encode into neural networks.
We develop a scalable and model-agnostic solution to achieve adversarial robustness without using any data.
arXiv Detail & Related papers (2023-08-15T10:37:04Z)
- DiffGuard: Semantic Mismatch-Guided Out-of-Distribution Detection using Pre-trained Diffusion Models [25.58447344260747]
We use pre-trained diffusion models for semantic mismatch-guided OOD detection, named DiffGuard.
Experiments show that DiffGuard is effective on both CIFAR-10 and hard cases of the large-scale ImageNet.
It can be easily combined with existing OOD detection techniques to achieve state-of-the-art OOD detection results.
arXiv Detail & Related papers (2023-07-15T04:48:35Z)
- ExposureDiffusion: Learning to Expose for Low-light Image Enhancement [87.08496758469835]
This work addresses the issue by seamlessly integrating a diffusion model with a physics-based exposure model.
Our method obtains significantly improved performance and reduced inference time compared with vanilla diffusion models.
The proposed framework works with real-paired datasets, SOTA noise models, and different backbone networks.
arXiv Detail & Related papers (2023-06-12T18:12:19Z)
- Learning to Mask and Permute Visual Tokens for Vision Transformer Pre-Training [59.923672191632065]
We propose a new self-supervised pre-training approach, named Masked and Permuted Vision Transformer (MaPeT)
MaPeT employs autoregressive and permuted predictions to capture intra-patch dependencies.
Our results demonstrate that MaPeT achieves competitive performance on ImageNet.
arXiv Detail & Related papers (2023-02-02T19:28:41Z)
- Effective Robustness against Natural Distribution Shifts for Models with Different Training Data [113.21868839569]
"Effective robustness" measures the extra out-of-distribution robustness beyond what can be predicted from the in-distribution (ID) performance.
We propose a new evaluation metric to evaluate and compare the effective robustness of models trained on different data.
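The metric summarized above can be illustrated with a toy computation. This sketch assumes a linear fit from ID to OOD accuracy over a set of baseline models; the paper's exact fitting procedure (e.g. a logit-transformed fit) may differ.

```python
import numpy as np

def effective_robustness(id_acc, ood_acc, baseline_id, baseline_ood):
    """Effective robustness: the gap between a model's actual OOD accuracy
    and the OOD accuracy predicted from its ID accuracy, where the
    prediction is a linear fit over baseline models (an assumption here)."""
    slope, intercept = np.polyfit(baseline_id, baseline_ood, 1)
    predicted_ood = slope * id_acc + intercept
    return ood_acc - predicted_ood
```

A model whose OOD accuracy sits above the baseline trend line has positive effective robustness; one on the line has zero, regardless of how high its raw accuracies are.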
arXiv Detail & Related papers (2023-02-02T19:28:41Z)
- Deep Learning-Based Defect Classification and Detection in SEM Images [1.9206693386750882]
In particular, we train RetinaNet models using different ResNet, VGGNet architectures as backbone.
We propose a preference-based ensemble strategy to combine the output predictions from different models in order to achieve better performance on classification and detection of defects.
arXiv Detail & Related papers (2022-06-20T16:34:11Z)
- A Hierarchical Transformation-Discriminating Generative Model for Few Shot Anomaly Detection [93.38607559281601]
We devise a hierarchical generative model that captures the multi-scale patch distribution of each training image.
The anomaly score is obtained by aggregating the patch-based votes of the correct transformation across scales and image regions.
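The vote-aggregation step described above can be sketched as follows. The interface is a hypothetical assumption: each entry of a vote map is taken to be the probability the model assigns to the correct transformation at that patch location and scale, which is all the summary specifies.

```python
import numpy as np

def anomaly_score(vote_maps):
    """Aggregate patch-based correct-transformation votes across scales
    and image regions. `vote_maps` is a list over scales of (H_s, W_s)
    arrays of correct-transformation probabilities (assumed interface).
    Fewer correct votes => higher anomaly score."""
    per_scale = [1.0 - v.mean() for v in vote_maps]
    return float(np.mean(per_scale))
```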
arXiv Detail & Related papers (2021-04-29T17:49:48Z)
- Uncertainty-aware Generalized Adaptive CycleGAN [44.34422859532988]
Unpaired image-to-image translation refers to learning inter-image-domain mapping in an unsupervised manner.
Existing methods often learn deterministic mappings without explicitly modelling the robustness to outliers or predictive uncertainty.
We propose a novel probabilistic method called Uncertainty-aware Generalized Adaptive Cycle Consistency (UGAC)
arXiv Detail & Related papers (2021-02-23T15:22:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.