Accounting for Unobserved Confounding in Domain Generalization
- URL: http://arxiv.org/abs/2007.10653v6
- Date: Thu, 3 Feb 2022 15:26:42 GMT
- Title: Accounting for Unobserved Confounding in Domain Generalization
- Authors: Alexis Bellot and Mihaela van der Schaar
- Abstract summary: This paper investigates the problem of learning robust, generalizable prediction models from a combination of datasets.
Part of the challenge of learning robust models lies in the influence of unobserved confounders.
We demonstrate the empirical performance of our approach on healthcare data from different modalities.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper investigates the problem of learning robust, generalizable
prediction models from a combination of multiple datasets and qualitative
assumptions about the underlying data-generating model. Part of the challenge
of learning robust models lies in the influence of unobserved confounders that
void many of the invariances and principles of minimum error presently used for
this problem. Our approach is to define a different invariance property of
causal solutions in the presence of unobserved confounders which, through a
relaxation of this invariance, can be connected with an explicit
distributionally robust optimization problem over a set of affine combination
of data distributions. Concretely, our objective takes the form of a standard
loss, plus a regularization term that encourages partial equality of error
derivatives with respect to model parameters. We demonstrate the empirical
performance of our approach on healthcare data from different modalities,
including image, speech and tabular data.
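The objective described in the abstract — a standard loss plus a regularizer encouraging partial equality of error derivatives across datasets — can be illustrated with a minimal sketch. This is an assumption-laden toy (a linear least-squares model, a squared-deviation penalty on per-domain gradients, and a weight `lam`), not the paper's exact formulation:

```python
import numpy as np

def gradient_penalty_objective(w, domains, lam=1.0):
    """Toy sketch: average loss over datasets plus a penalty that
    encourages equality of per-domain error derivatives.

    w       -- parameter vector of a linear model (assumed for illustration)
    domains -- list of (X, y) arrays, one pair per dataset/domain
    lam     -- regularization weight (hypothetical)
    """
    losses, grads = [], []
    for X, y in domains:
        r = X @ w - y                        # residuals in this domain
        losses.append(np.mean(r ** 2))       # per-domain squared error
        grads.append(2 * X.T @ r / len(y))   # per-domain error derivative
    grads = np.stack(grads)
    # Penalize deviation of each domain's gradient from the mean gradient;
    # the penalty vanishes when all domains agree on the error derivative.
    penalty = np.mean(np.sum((grads - grads.mean(axis=0)) ** 2, axis=1))
    return np.mean(losses) + lam * penalty
```

When all domains share the same data, the penalty is zero and the objective reduces to the plain average loss; distinct domains with disagreeing gradients incur an extra cost.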
Related papers
- The Benefits of Balance: From Information Projections to Variance Reduction [7.082773426322819]
We show that an iterative algorithm, usually used to avoid representation collapse, enjoys an unsuspected benefit.
We provide non-asymptotic bounds quantifying this variance reduction effect and relate them to the eigendecays of appropriately defined Markov operators.
We explain how various forms of data balancing in contrastive multimodal learning and self-supervised clustering can be interpreted as instances of this variance reduction scheme.
arXiv Detail & Related papers (2024-08-27T13:48:15Z)
- Learning Divergence Fields for Shift-Robust Graph Representations [73.11818515795761]
In this work, we propose a geometric diffusion model with learnable divergence fields for the challenging problem with interdependent data.
We derive a new learning objective through causal inference, which can guide the model to learn generalizable patterns of interdependence that are insensitive across domains.
arXiv Detail & Related papers (2024-06-07T14:29:21Z)
- Task Groupings Regularization: Data-Free Meta-Learning with Heterogeneous Pre-trained Models [83.02797560769285]
Data-Free Meta-Learning (DFML) aims to derive knowledge from a collection of pre-trained models without accessing their original data.
Current methods often overlook the heterogeneity among pre-trained models, which leads to performance degradation due to task conflicts.
We propose Task Groupings Regularization, a novel approach that benefits from model heterogeneity by grouping and aligning conflicting tasks.
arXiv Detail & Related papers (2024-05-26T13:11:55Z)
- It's All in the Mix: Wasserstein Machine Learning with Mixed Features [5.739657897440173]
We present a practically efficient algorithm to solve mixed-feature problems.
We demonstrate that our approach can significantly outperform existing methods that are agnostic to the presence of discrete features.
arXiv Detail & Related papers (2023-12-19T15:15:52Z)
- Ensemble Modeling for Multimodal Visual Action Recognition [50.38638300332429]
We propose an ensemble modeling approach for multimodal action recognition.
We independently train individual modality models using a variant of focal loss tailored to handle the long-tailed distribution of the MECCANO [21] dataset.
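The focal loss mentioned above can be sketched in its standard binary form (Lin et al.); the paper uses an unspecified variant tailored to the long-tailed MECCANO dataset, so the `gamma`/`alpha` defaults below are illustrative assumptions:

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Standard binary focal loss: down-weights well-classified examples.

    p -- predicted probability of the positive class, in (0, 1)
    y -- binary labels (0 or 1)
    """
    p_t = np.where(y == 1, p, 1 - p)           # prob. of the true class
    alpha_t = np.where(y == 1, alpha, 1 - alpha)
    # (1 - p_t)^gamma shrinks the loss for confident correct predictions
    return -alpha_t * (1 - p_t) ** gamma * np.log(p_t)
```

With `gamma=0` and `alpha=1` this reduces to the ordinary cross-entropy; larger `gamma` suppresses the contribution of easy examples, which is the property exploited for long-tailed class distributions.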
arXiv Detail & Related papers (2023-08-10T08:43:20Z)
- Consistent Explanations in the Face of Model Indeterminacy via Ensembling [12.661530681518899]
This work addresses the challenge of providing consistent explanations for predictive models in the presence of model indeterminacy.
We introduce ensemble methods to enhance the consistency of the explanations provided in these scenarios.
Our findings highlight the importance of considering model indeterminacy when interpreting explanations.
arXiv Detail & Related papers (2023-06-09T18:45:43Z)
- Learning Invariant Representations under General Interventions on the Response [2.725698729450241]
We focus on linear structural causal models (SCMs) and introduce the invariant matching property (IMP).
We analyze the generalization errors of our method under both the discrete and continuous environment settings.
arXiv Detail & Related papers (2022-08-22T03:09:17Z)
- Learning from few examples with nonlinear feature maps [68.8204255655161]
We explore the phenomenon and reveal key relationships between the dimensionality of an AI model's feature space, the non-degeneracy of data distributions, and the model's generalisation capabilities.
The main thrust of our present analysis is on the influence of nonlinear feature transformations mapping original data into higher- and possibly infinite-dimensional spaces on the resulting model's generalisation capabilities.
arXiv Detail & Related papers (2022-03-31T10:36:50Z)
- Discriminative Multimodal Learning via Conditional Priors in Generative Models [21.166519800652047]
This research studies the realistic scenario in which all modalities and class labels are available for model training.
We show, in this scenario, that the variational lower bound limits mutual information between joint representations and missing modalities.
arXiv Detail & Related papers (2021-10-09T17:22:24Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.