Neural Networks for Learning Counterfactual G-Invariances from Single
Environments
- URL: http://arxiv.org/abs/2104.10105v1
- Date: Tue, 20 Apr 2021 16:35:35 GMT
- Title: Neural Networks for Learning Counterfactual G-Invariances from Single
Environments
- Authors: S Chandra Mouli and Bruno Ribeiro
- Abstract summary: Neural networks are believed to have difficulties extrapolating beyond the training data distribution.
This work shows that, for extrapolations based on finite transformation groups, a model's inability to extrapolate is unrelated to its capacity.
- Score: 13.848760376470038
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite -- or maybe because of -- their astonishing capacity to fit data,
neural networks are believed to have difficulties extrapolating beyond the training
data distribution. This work shows that, for extrapolations based on finite
transformation groups, a model's inability to extrapolate is unrelated to its
capacity. Rather, the shortcoming is inherited from a learning hypothesis:
Examples not explicitly observed, even with infinitely many training examples, have
underspecified outcomes in the learner's model. In order to endow neural
networks with the ability to extrapolate over group transformations, we
introduce a learning framework counterfactually-guided by the learning
hypothesis that any group invariance to (known) transformation groups is
mandatory even without evidence, unless the learner deems it inconsistent with
the training data. Unlike existing invariance-driven methods for
(counterfactual) extrapolations, this framework allows extrapolations from a
single environment. Finally, we introduce sequence and image extrapolation
tasks that validate our framework and showcase the shortcomings of traditional
approaches.
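To make the counterfactual learning hypothesis concrete, here is a minimal sketch of the idea on a toy case: given a known finite transformation group (the two-element reflection group acting on fixed-length sequences), the learner treats invariance to the group as mandatory unless it contradicts the single training environment, and enforces it by averaging the predictor over group orbits. This is an illustrative sketch, not the authors' implementation; the helpers (reflection_group, invariance_consistent, symmetrize) and the orbit-averaging construction are assumptions made for exposition.

import numpy as np

def reflection_group():
    """Finite group G = {identity, reverse} acting on 1-D sequences."""
    return [lambda x: x, lambda x: x[::-1]]

def invariance_consistent(X, y, group):
    """True if no two training points in the same G-orbit carry different
    labels, i.e. enforcing G-invariance cannot contradict the training data."""
    seen = {}
    for x, label in zip(X, y):
        # Canonical orbit representative: lexicographically smallest transform.
        key = min(tuple(g(x)) for g in group)
        if key in seen and seen[key] != label:
            return False
        seen[key] = label
    return True

def symmetrize(predict, group):
    """Wrap a predictor so its output is averaged over the G-orbit of the
    input, making it exactly G-invariant by construction."""
    return lambda x: np.mean([predict(g(x)) for g in group], axis=0)

# Toy single environment: labels depend only on the multiset of tokens,
# so reversal invariance is consistent with the training data.
X_train = [np.array([1, 2, 3]), np.array([3, 2, 1]), np.array([4, 4, 1])]
y_train = [0, 0, 1]
base_predict = lambda x: float(x[0] > 2)  # some fitted, non-invariant model

G = reflection_group()
model = symmetrize(base_predict, G) if invariance_consistent(X_train, y_train, G) else base_predict
print(model(np.array([3, 2, 1])))  # prediction averaged over {x, reversed(x)}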
Related papers
- Environment Diversification with Multi-head Neural Network for Invariant
Learning [7.255121332331688]
This work proposes EDNIL, an invariant learning framework containing a multi-head neural network to absorb data biases.
We show that this framework does not require prior knowledge about environments or strong assumptions about the pre-trained model.
We demonstrate that models trained with EDNIL are empirically more robust against distributional shifts.
arXiv Detail & Related papers (2023-08-17T04:33:38Z)
- Unsupervised Learning of Invariance Transformations [105.54048699217668]
We develop an algorithmic framework for finding approximate graph automorphisms.
We discuss how this framework can be used to find approximate automorphisms in weighted graphs in general.
arXiv Detail & Related papers (2023-07-24T17:03:28Z)
- First Steps Toward Understanding the Extrapolation of Nonlinear Models
to Unseen Domains [35.76184529520015]
This paper makes some initial steps towards analyzing the extrapolation of nonlinear models for structured domain shift.
We prove that the family of nonlinear models of the form $f(x)=\sum_i f_i(x_i)$ can extrapolate to unseen distributions (see the sketch after this list).
arXiv Detail & Related papers (2022-11-21T18:41:19Z)
- When Does Group Invariant Learning Survive Spurious Correlations? [29.750875769713513]
In this paper, we reveal the insufficiency of existing group invariant learning methods.
We propose two criteria for judging such sufficiency.
We show that existing methods can violate both criteria and thus fail in generalizing to spurious correlation shifts.
Motivated by this, we design a new group invariant learning method, which constructs groups with statistical independence tests.
arXiv Detail & Related papers (2022-06-29T11:16:11Z)
- Equivariance Allows Handling Multiple Nuisance Variables When Analyzing
Pooled Neuroimaging Datasets [53.34152466646884]
In this paper, we show how combining recent results on equivariant representation learning over structured spaces with a simple use of classical results from causal inference provides an effective, practical solution.
We demonstrate how our model allows dealing with more than one nuisance variable under some assumptions and can enable analysis of pooled scientific datasets in scenarios that would otherwise entail removing a large portion of the samples.
arXiv Detail & Related papers (2022-03-29T04:54:06Z)
- Agree to Disagree: Diversity through Disagreement for Better
Transferability [54.308327969778155]
We propose D-BAT (Diversity-By-disAgreement Training), which enforces agreement among the models on the training data but diversity on out-of-distribution data.
We show how D-BAT naturally emerges from the notion of generalized discrepancy.
arXiv Detail & Related papers (2022-02-09T12:03:02Z)
- Learning Invariances in Neural Networks [51.20867785006147]
We show how to parameterize a distribution over augmentations and optimize the training loss simultaneously with respect to the network parameters and augmentation parameters.
We can recover the correct set and extent of invariances on image classification, regression, segmentation, and molecular property prediction from a large space of augmentations.
arXiv Detail & Related papers (2020-10-22T17:18:48Z)
- On the Benefits of Invariance in Neural Networks [56.362579457990094]
We show that training with data augmentation leads to better estimates of risk and gradients thereof, and we provide a PAC-Bayes generalization bound for models trained with data augmentation.
We also show that compared to data augmentation, feature averaging reduces generalization error when used with convex losses, and tightens PAC-Bayes bounds.
arXiv Detail & Related papers (2020-05-01T02:08:58Z)
- Learning What Makes a Difference from Counterfactual Examples and
Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally-different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
- Domain segmentation and adjustment for generalized zero-shot learning [22.933463036413624]
In zero-shot learning, synthesizing unseen data with generative models has been the most popular method to address the imbalance of training data between seen and unseen classes.
We argue that synthesizing unseen data may not be an ideal approach for addressing the domain shift caused by the imbalance of the training data.
In this paper, we propose to realize generalized zero-shot recognition in different domains.
arXiv Detail & Related papers (2020-02-01T15:00:56Z)
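As a side note to the entry "First Steps Toward Understanding the Extrapolation of Nonlinear Models to Unseen Domains" above, the following toy sketch illustrates the structural claim about additive models $f(x)=\sum_i f_i(x_i)$: because each coordinate contributes through its own function, an additive fit can stay accurate when a single coordinate shifts at test time. The construction below (cubic per-coordinate basis, additive ground truth, the specific ranges) is my own illustrative assumption, not an experiment from that paper.

import numpy as np

rng = np.random.default_rng(0)

def additive_basis(x):
    # Per-coordinate polynomial features for f(x) = f_1(x_1) + f_2(x_2),
    # with each f_i a cubic polynomial.
    x1, x2 = x[:, 0], x[:, 1]
    return np.stack([np.ones_like(x1), x1, x1**2, x1**3, x2, x2**2, x2**3], axis=1)

truth = lambda x: np.sin(x[:, 0]) + 0.5 * x[:, 1] ** 2  # additive ground truth

# Train with both coordinates drawn from [-2, 2].
X_train = rng.uniform(-2, 2, size=(500, 2))
w, *_ = np.linalg.lstsq(additive_basis(X_train), truth(X_train), rcond=None)

# Test with a domain shift in the second coordinate only: x_2 in [2, 4].
X_test = np.column_stack([rng.uniform(-2, 2, 200), rng.uniform(2, 4, 200)])
mse = np.mean((additive_basis(X_test) @ w - truth(X_test)) ** 2)
print(f"MSE of the additive fit under the coordinate shift: {mse:.4f}")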
This list is automatically generated from the titles and abstracts of the papers in this site.