Implicit Counterfactual Data Augmentation for Deep Neural Networks
- URL: http://arxiv.org/abs/2304.13431v1
- Date: Wed, 26 Apr 2023 10:36:40 GMT
- Title: Implicit Counterfactual Data Augmentation for Deep Neural Networks
- Authors: Xiaoling Zhou, Ou Wu
- Abstract summary: Machine-learning models are prone to capturing spurious correlations between non-causal attributes and classes.
This study proposes an implicit counterfactual data augmentation method to remove spurious correlations and make stable predictions.
- Score: 3.6397924689580745
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine-learning models are prone to capturing the spurious correlations
between non-causal attributes and classes, with counterfactual data
augmentation being a promising direction for breaking these spurious
associations. However, explicitly generating counterfactual data is
challenging and degrades training efficiency. Therefore, this study
proposes an implicit counterfactual data augmentation (ICDA) method to remove
spurious correlations and make stable predictions. Specifically, first, a novel
sample-wise augmentation strategy is developed that generates semantically and
counterfactually meaningful deep features with distinct augmentation strength
for each sample. Second, we derive an easy-to-compute surrogate loss on the
augmented feature set when the number of augmented samples becomes infinite.
Third, two concrete schemes are proposed, including direct quantification and
meta-learning, to derive the key parameters for the robust loss. In addition,
ICDA is explained from a regularization perspective, with extensive experiments
indicating that our method consistently improves the generalization performance
of popular deep networks on multiple typical learning scenarios that require
out-of-distribution generalization.
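A minimal sketch of the surrogate-loss pattern the abstract describes, in the spirit of the related ISDA method listed below: treat each deep feature as if it were perturbed by zero-mean Gaussian noise with a class-conditional covariance, let the number of augmented copies go to infinity, and minimize a closed-form upper bound of the expected cross-entropy. The per-sample strength `lam` reflects ICDA's sample-wise augmentation; every name, shape, and the covariance estimate are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def implicit_augmentation_loss(features, labels, fc_weight, fc_bias, cov, lam):
    """Upper-bound cross-entropy under infinite Gaussian feature augmentation.

    features : (N, D) deep features from the backbone
    labels   : (N,)   ground-truth class indices
    fc_weight: (C, D) weight of the final linear classifier
    fc_bias  : (C,)   bias of the final linear classifier
    cov      : (C, D, D) per-class feature covariance estimates
    lam      : (N,)   per-sample augmentation strength (sample-wise here;
                      ISDA uses a single global value)
    """
    logits = features @ fc_weight.t() + fc_bias        # (N, C)
    w_y = fc_weight[labels]                            # (N, D)
    v = fc_weight.unsqueeze(0) - w_y.unsqueeze(1)      # (N, C, D)
    sigma = cov[labels]                                # (N, D, D)
    # quadratic form v^T Sigma v for every class j; zero when j == y
    quad = torch.einsum('ncd,nde,nce->nc', v, sigma, v)  # (N, C)
    adjusted = logits + 0.5 * lam.view(-1, 1) * quad
    return F.cross_entropy(adjusted, labels)
```

How the per-sample strengths and augmentation distributions are actually obtained (direct quantification or meta-learning, per the abstract) is exactly what this sketch leaves abstract.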
Related papers
- Boosting Model Resilience via Implicit Adversarial Data Augmentation [20.768174896574916]
We propose to augment the deep features of samples by incorporating adversarial and anti-adversarial perturbation distributions.
We then theoretically reveal that our augmentation process approximates the optimization of a surrogate loss function.
We conduct extensive experiments across four common biased learning scenarios (a toy feature-perturbation sketch appears after this list).
arXiv Detail & Related papers (2024-04-25T03:22:48Z)
- Semi-Supervised Learning via Swapped Prediction for Communication Signal Recognition [11.325643693823828]
Training a deep neural network on small datasets with few labels generally leads to overfitting, resulting in degraded performance.
We develop a semi-supervised learning (SSL) method that effectively exploits a large collection of more readily available unlabeled signal data to improve generalization (see the swapped-prediction sketch after this list).
arXiv Detail & Related papers (2023-11-14T14:08:55Z)
- Regularization Through Simultaneous Learning: A Case Study on Plant Classification [0.0]
This paper introduces Simultaneous Learning, a regularization approach drawing on principles of Transfer Learning and Multi-task Learning.
We leverage auxiliary datasets alongside the target dataset, UFOP-HVD, to facilitate simultaneous classification guided by a customized loss function.
Remarkably, our approach demonstrates superior performance over models without regularization.
arXiv Detail & Related papers (2023-05-22T19:44:57Z)
- Uncertainty Estimation by Fisher Information-based Evidential Deep Learning [61.94125052118442]
Uncertainty estimation is a key factor that makes deep learning reliable in practical applications.
We propose a novel method, Fisher Information-based Evidential Deep Learning ($\mathcal{I}$-EDL).
In particular, we introduce the Fisher Information Matrix (FIM) to measure the informativeness of the evidence carried by each sample, according to which we dynamically reweight the objective loss terms so that the network focuses more on the representation learning of uncertain classes (see the reweighting sketch after this list).
arXiv Detail & Related papers (2023-03-03T16:12:59Z)
- Regularizing Variational Autoencoder with Diversity and Uncertainty Awareness [61.827054365139645]
The Variational Autoencoder (VAE) approximates the posterior of latent variables via amortized variational inference.
We propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space.
arXiv Detail & Related papers (2021-10-24T07:58:13Z)
- Predicting Deep Neural Network Generalization with Perturbation Response Curves [58.8755389068888]
We propose a new framework for evaluating the generalization capabilities of trained networks.
Specifically, we introduce two new measures for accurately predicting generalization gaps.
We attain better predictive scores than the current state-of-the-art measures on a majority of tasks in the Predicting Generalization in Deep Learning (PGDL) NeurIPS 2020 competition.
arXiv Detail & Related papers (2021-06-09T01:37:36Z)
- Self-paced Data Augmentation for Training Neural Networks [11.554821454921536]
We propose a self-paced augmentation (SPA) method that automatically selects suitable samples for data augmentation when training a neural network.
The proposed method mitigates the deterioration of generalization performance caused by ineffective data augmentation.
Experimental results demonstrate that SPA can improve generalization performance, particularly when the number of training samples is small (see the selection sketch after this list).
arXiv Detail & Related papers (2020-10-29T09:13:18Z)
- Understanding Self-supervised Learning with Dual Deep Networks [74.92916579635336]
We propose a novel framework to understand contrastive self-supervised learning (SSL) methods that employ dual pairs of deep ReLU networks.
We prove that in each SGD update of SimCLR with various loss functions, the weights at each layer are updated by a covariance operator.
To further study what role the covariance operator plays and which features are learned in such a process, we model the data generation and augmentation processes through a hierarchical latent tree model (HLTM).
arXiv Detail & Related papers (2020-10-01T17:51:49Z)
- Regularizing Deep Networks with Semantic Data Augmentation [44.53483945155832]
We propose a novel semantic data augmentation algorithm to complement traditional approaches.
The proposed method is inspired by the intriguing property that deep networks are effective in learning linearized features.
We show that the proposed implicit semantic data augmentation (ISDA) algorithm amounts to minimizing a novel robust cross-entropy (CE) loss (the sketch after the abstract above follows the same pattern).
arXiv Detail & Related papers (2020-07-21T00:32:44Z)
- On the Benefits of Invariance in Neural Networks [56.362579457990094]
We show that training with data augmentation leads to better estimates of the risk and of its gradients, and we provide a PAC-Bayes generalization bound for models trained with data augmentation.
We also show that, compared to data augmentation, feature averaging reduces generalization error when used with convex losses and tightens PAC-Bayes bounds (see the averaging sketch after this list).
arXiv Detail & Related papers (2020-05-01T02:08:58Z)
- Generative Data Augmentation for Commonsense Reasoning [75.26876609249197]
G-DAUGC is a novel generative data augmentation method that aims to achieve more accurate and robust learning in the low-resource setting.
G-DAUGC consistently outperforms existing data augmentation methods based on back-translation.
Our analysis demonstrates that G-DAUGC produces a diverse set of fluent training examples, and that its selection and training approaches are important for performance.
arXiv Detail & Related papers (2020-04-24T06:12:10Z)
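For "Boosting Model Resilience via Implicit Adversarial Data Augmentation" above, the core move of perturbing deep features in adversarial and anti-adversarial directions can be materialized in toy form as a signed-gradient step on the feature vector. The paper works with perturbation distributions and a surrogate loss; the single FGSM-style step and the magnitude `eps` here are assumptions of this sketch.

```python
import torch
import torch.nn.functional as F

def perturbed_feature_losses(classifier, features, labels, eps=0.1):
    """Build one adversarial (+) and one anti-adversarial (-) feature copy.

    classifier: the final linear head mapping deep features to logits
    features  : (N, D) deep features produced by the backbone
    eps       : assumed perturbation magnitude
    """
    feats = features.detach().requires_grad_(True)
    loss = F.cross_entropy(classifier(feats), labels)
    grad, = torch.autograd.grad(loss, feats)
    step = eps * grad.sign()
    adv_loss = F.cross_entropy(classifier(feats + step), labels)   # harder copy
    anti_loss = F.cross_entropy(classifier(feats - step), labels)  # easier copy
    return adv_loss, anti_loss
```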
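The swapped-prediction idea in the communication-signal SSL entry can be rendered generically: two augmented views of the same unlabeled batch, each trained against the other view's confident pseudo-labels. The confidence threshold and the views themselves are assumed placeholders, not details from the paper.

```python
import torch
import torch.nn.functional as F

def swapped_prediction_loss(model, view_a, view_b, threshold=0.95):
    """Generic swapped-prediction consistency loss on unlabeled data."""
    logits_a, logits_b = model(view_a), model(view_b)
    with torch.no_grad():
        conf_a, pseudo_a = logits_a.softmax(dim=1).max(dim=1)
        conf_b, pseudo_b = logits_b.softmax(dim=1).max(dim=1)
    # each view learns from the OTHER view's confident pseudo-labels
    loss_a = (F.cross_entropy(logits_a, pseudo_b, reduction='none')
              * (conf_b >= threshold).float()).mean()
    loss_b = (F.cross_entropy(logits_b, pseudo_a, reduction='none')
              * (conf_a >= threshold).float()).mean()
    return loss_a + loss_b
```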
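For $\mathcal{I}$-EDL, a simplified reading of the FIM reweighting: the network outputs Dirichlet evidence, and each class term of the evidential loss is weighted by the trigamma of its concentration parameter, which governs the diagonal of the Dirichlet Fisher information. This stand-in captures the idea but is not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def fim_weighted_edl_loss(logits, targets_onehot):
    """Evidential loss with Fisher-information-style per-class weights.

    logits        : (N, C) raw network outputs
    targets_onehot: (N, C) one-hot ground-truth labels
    """
    alpha = F.softplus(logits) + 1.0            # Dirichlet concentrations
    strength = alpha.sum(dim=1, keepdim=True)   # total evidence
    prob = alpha / strength                     # expected class probabilities
    err = (targets_onehot - prob) ** 2          # squared prediction error
    var = prob * (1.0 - prob) / (strength + 1.0)
    # trigamma(alpha) is large when evidence is scarce, so uncertain
    # classes dominate the objective, mirroring the reweighting idea
    fim_diag = torch.polygamma(1, alpha)
    return (fim_diag * (err + var)).sum(dim=1).mean()
```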
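Self-paced augmentation concerns which samples to augment rather than how. One plausible instantiation, with the loss cutoff and the "augment the already-fitted samples" rule assumed rather than taken from the paper:

```python
import torch
import torch.nn.functional as F

def self_paced_batch(model, inputs, labels, augment, loss_cutoff):
    """Apply augmentation only to samples selected by their current loss.

    augment    : any input-space augmentation function (assumed given)
    loss_cutoff: assumed threshold separating selected from skipped samples
    """
    with torch.no_grad():
        per_sample = F.cross_entropy(model(inputs), labels, reduction='none')
        selected = per_sample <= loss_cutoff   # samples the model already fits
    out = inputs.clone()
    if selected.any():
        out[selected] = augment(inputs[selected])
    return out, labels
```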
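Feature averaging in the invariance entry is analyzed over group orbits; a minimal stand-in averages the model's outputs over a fixed list of augmentations (for a linear head, averaging features and averaging logits coincide). By Jensen's inequality, a convex loss of the averaged prediction is at most the average loss of the individual predictions, which is the effect the paper quantifies.

```python
import torch

def averaged_logits(model, x, augmentations):
    """Average predictions over a set of augmentations of the same input.

    augmentations: assumed list of callables, e.g. rotations or flips
    """
    return torch.stack([model(aug(x)) for aug in augmentations]).mean(dim=0)
```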