A Shift in Perspective on Causality in Domain Generalization
- URL: http://arxiv.org/abs/2508.12798v1
- Date: Mon, 18 Aug 2025 10:19:33 GMT
- Title: A Shift in Perspective on Causality in Domain Generalization
- Authors: Damian Machlanski, Stephanie Riley, Edward Moroshko, Kurt Butler, Panagiotis Dimitrakopoulos, Thomas Melistas, Akchunya Chanchal, Steven McDonagh, Ricardo Silva, Sotirios A. Tsaftaris
- Abstract summary: We revisit the claims of the causality and DG literature. We argue for a more nuanced theory of the role of causality in generalization. We also provide an interactive demo at https://chai-uk.github.io/ukairs25-causal-predictors/.
- Score: 16.172002413067396
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The promise that causal modelling can lead to robust AI generalization has been challenged in recent work on domain generalization (DG) benchmarks. We revisit the claims of the causality and DG literature, reconciling apparent contradictions and advocating for a more nuanced theory of the role of causality in generalization. We also provide an interactive demo at https://chai-uk.github.io/ukairs25-causal-predictors/.
Related papers
- Contrastive Weak-to-strong Generalization [50.5986177336082]
We propose Contrastive Weak-to-Strong Generalization (ConG) to advance weak-to-strong generalization. This framework employs contrastive decoding between pre- and post-alignment weak models to generate higher-quality samples.
arXiv Detail & Related papers (2025-10-09T07:37:23Z)
- Fair Deepfake Detectors Can Generalize [51.21167546843708]
We show that controlling for confounders (data distribution and model capacity) enables improved generalization via fairness interventions. Motivated by this insight, we propose Demographic Attribute-insensitive Intervention Detection (DAID), a plug-and-play framework composed of: i) Demographic-aware data rebalancing, which employs inverse-propensity weighting and subgroup-wise feature normalization to neutralize distributional biases; and ii) Demographic-agnostic feature aggregation, which uses a novel alignment loss to suppress sensitive-attribute signals. DAID consistently achieves superior performance in both fairness and generalization compared to several state-of-the-art
arXiv Detail & Related papers (2025-07-03T14:10:02Z)
- A Flat Minima Perspective on Understanding Augmentations and Model Robustness [4.297070083645049]
We offer a unified theoretical framework to clarify how augmentations can enhance model robustness. Our work diverges from prior studies in that our analysis broadly encompasses much of the existing augmentation methods. We confirm our theories through simulations on the existing common corruption and adversarial robustness benchmarks.
arXiv Detail & Related papers (2025-05-30T13:40:44Z)
- Generalizing across Temporal Domains with Koopman Operators [15.839454056986446]
In this study, we contribute novel theoretical results showing that aligning conditional distributions leads to a reduction of the generalization bound.
Our analysis serves as a key motivation for solving the Temporal Domain Generalization (TDG) problem through the application of Koopman Neural Operators.
arXiv Detail & Related papers (2024-02-12T17:45:40Z)
- Improving Generalization with Domain Convex Game [32.07275105040802]
Domain generalization tends to alleviate the poor generalization capability of deep neural networks by learning a model from multiple source domains.
A classical solution to DG is domain augmentation, which rests on the common belief that diversifying source domains is conducive to out-of-distribution generalization.
Our explorations reveal that the correlation between model generalization and the diversity of domains may not be strictly positive, which limits the effectiveness of domain augmentation.
arXiv Detail & Related papers (2023-03-23T14:27:49Z)
- Domain Generalization -- A Causal Perspective [20.630396283221838]
Machine learning models have gained widespread success, from healthcare to personalized recommendations.
One of the preliminary assumptions of these models is that the data are independent and identically distributed (i.i.d.).
Since the models rely heavily on this assumption, they exhibit poor generalization capabilities when it does not hold.
arXiv Detail & Related papers (2022-09-30T01:56:49Z)
- Causal Inference Principles for Reasoning about Commonsense Causality [93.19149325083968]
Commonsense causality reasoning aims at identifying plausible causes and effects in natural language descriptions that are deemed reasonable by an average person.
Existing work usually relies on deep language models wholeheartedly, and is potentially susceptible to confounding co-occurrences.
Motivated by classical causal principles, we articulate the central question of CCR and draw parallels between human subjects in observational studies and natural languages.
We propose a novel framework, ROCK, to Reason O(A)bout Commonsense K(C)ausality, which utilizes temporal signals as incidental supervision.
arXiv Detail & Related papers (2022-01-31T06:12:39Z)
- Towards Principled Disentanglement for Domain Generalization [90.9891372499545]
A fundamental challenge for machine learning models is generalizing to out-of-distribution (OOD) data.
We first formalize the OOD generalization problem as a constrained optimization problem, called Disentanglement-constrained Domain Generalization (DDG).
Based on the transformation, we propose a primal-dual algorithm for joint representation disentanglement and domain generalization.
arXiv Detail & Related papers (2021-11-27T07:36:32Z)
- Contrastive Syn-to-Real Generalization [125.54991489017854]
We make a key observation that the diversity of the learned feature embeddings plays an important role in the generalization performance.
We propose contrastive synthetic-to-real generalization (CSG), a novel framework that leverages the pre-trained ImageNet knowledge to prevent overfitting to the synthetic domain.
We demonstrate the effectiveness of CSG on various synthetic training tasks, exhibiting state-of-the-art performance on zero-shot domain generalization.
arXiv Detail & Related papers (2021-04-06T05:10:29Z)
- In Search of Robust Measures of Generalization [79.75709926309703]
We develop bounds on generalization error, optimization error, and excess risk.
When evaluated empirically, most of these bounds are numerically vacuous.
We argue that generalization measures should instead be evaluated within the framework of distributional robustness.
arXiv Detail & Related papers (2020-10-22T17:54:25Z)
- Detecting and Understanding Generalization Barriers for Neural Machine Translation [53.23463279153577]
This paper attempts to identify and understand generalization barrier words within an unseen input sentence.
We propose a principled definition of generalization barrier words, along with a modified version that is computationally tractable.
We then conduct extensive analyses of the detected generalization barrier words on both Zh$\Leftrightarrow$En NIST benchmarks.
arXiv Detail & Related papers (2020-04-05T12:33:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.