Related papers: Modeling the Data-Generating Process is Necessary for Out-of-Distribution Generalization

URL: http://arxiv.org/abs/2206.07837v4
Date: Fri, 17 May 2024 22:36:13 GMT
Title: Modeling the Data-Generating Process is Necessary for Out-of-Distribution Generalization
Authors: Jivat Neet Kaur, Emre Kiciman, Amit Sharma
Abstract summary: Real-world data often has multiple distribution shifts over different attributes. No state-of-the-art DG algorithm performs consistently well on all shifts. We develop Causally Adaptive Constraint Minimization (CACM), an algorithm that uses knowledge about the data-generating process to adaptively identify and apply the correct independence constraints for regularization.
Score: 23.302060306322506
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent empirical studies on domain generalization (DG) have shown that DG algorithms that perform well on some distribution shifts fail on others, and no state-of-the-art DG algorithm performs consistently well on all shifts. Moreover, real-world data often has multiple distribution shifts over different attributes; hence we introduce multi-attribute distribution shift datasets and find that the accuracy of existing DG algorithms falls even further. To explain these results, we provide a formal characterization of generalization under multi-attribute shifts using a canonical causal graph. Based on the relationship between spurious attributes and the classification label, we obtain realizations of the canonical causal graph that characterize common distribution shifts and show that each shift entails different independence constraints over observed variables. As a result, we prove that any algorithm based on a single, fixed constraint cannot work well across all shifts, providing theoretical evidence for mixed empirical results on DG algorithms. Based on this insight, we develop Causally Adaptive Constraint Minimization (CACM), an algorithm that uses knowledge about the data-generating process to adaptively identify and apply the correct independence constraints for regularization. Results on fully synthetic, MNIST, small NORB, and Waterbirds datasets, covering binary and multi-valued attributes and labels, show that adaptive dataset-dependent constraints lead to the highest accuracy on unseen domains whereas incorrect constraints fail to do so. Our results demonstrate the importance of modeling the causal relationships inherent in the data-generating process.
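
Illustrative sketch: the abstract describes regularizing a model so that its learned representation satisfies an independence constraint chosen according to the type of shift. The Python sketch below shows one way such a penalty could be computed. It is not the authors' CACM implementation. The function names (`rbf_kernel`, `mmd2`, `independence_penalty`), the RBF kernel, and the use of maximum mean discrepancy (MMD) as the dependence measure are all assumptions made for illustration. Passing `labels` switches the penalty from an unconditional constraint (features independent of the attribute) to a conditional one (features independent of the attribute given the label); which constraint is appropriate for which shift is determined in the paper, not here.

```python
import numpy as np


def rbf_kernel(a, b, gamma=1.0):
    """RBF kernel matrix between the rows of a and the rows of b."""
    sq = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq, 0.0))


def mmd2(a, b, gamma=1.0):
    """Biased estimate of the squared MMD between samples a and b."""
    return (
        rbf_kernel(a, a, gamma).mean()
        + rbf_kernel(b, b, gamma).mean()
        - 2.0 * rbf_kernel(a, b, gamma).mean()
    )


def independence_penalty(features, attr, labels=None, gamma=1.0):
    """Sum of pairwise squared MMDs between feature groups defined by attr.

    labels=None  -> unconditional penalty (features vs. attribute).
    labels given -> the same penalty computed within each label value,
                    i.e. a conditional penalty given the label.
    """
    features = np.asarray(features, dtype=float)
    attr = np.asarray(attr)
    if labels is None:
        strata = [np.ones(len(attr), dtype=bool)]
    else:
        labels = np.asarray(labels)
        strata = [labels == y for y in np.unique(labels)]

    total = 0.0
    for mask in strata:
        values = np.unique(attr[mask])
        for i in range(len(values)):
            for j in range(i + 1, len(values)):
                g1 = features[mask & (attr == values[i])]
                g2 = features[mask & (attr == values[j])]
                total += mmd2(g1, g2, gamma)
    return total
```

In a training loop, a term like this would typically be added to the classification loss with a weighting coefficient, so that representations whose distribution differs across attribute values are penalized.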

Related papers

Correcting False Alarms from Unseen: Adapting Graph Anomaly Detectors at Test Time [60.341117019125214]
We propose a lightweight and plug-and-play Test-time adaptation framework for correcting Unseen Normal pattErns (TUNE) in graph anomaly detection (GAD). To address semantic confusion, a graph aligner is employed to align the shifted data to the original one at the graph attribute level. Extensive experiments on 10 real-world datasets demonstrate that TUNE significantly enhances the generalizability of pre-trained GAD models to both synthetic and real unseen normal patterns.
arXiv Detail & Related papers (2025-11-10T12:10:05Z)
Generative Risk Minimization for Out-of-Distribution Generalization on Graphs [71.48583448654522]
We propose an innovative framework, named Generative Risk Minimization (GRM), designed to generate an invariant subgraph for each input graph to be classified, instead of extraction. We conduct extensive experiments across a variety of real-world graph datasets for both node-level and graph-level OOD generalization.
arXiv Detail & Related papers (2025-02-11T21:24:13Z)
Knowledge Distillation and Enhanced Subdomain Adaptation Using Graph Convolutional Network for Resource-Constrained Bearing Fault Diagnosis [0.0]
We propose a progressive knowledge distillation framework that transfers knowledge from a complex teacher model to a compact and efficient student model. We introduce Enhanced Local Maximum Mean Squared Discrepancy (ELMMSD), which leverages mean and variance statistics in the Reproducing Kernel Hilbert Space (RKHS) and incorporates a priori probability distributions between labels.
arXiv Detail & Related papers (2025-01-13T10:05:47Z)
DeCaf: A Causal Decoupling Framework for OOD Generalization on Node Classification [14.96980804513399]
Graph Neural Networks (GNNs) are susceptible to distribution shifts, creating vulnerability and security issues in critical domains. Existing methods that target learning an invariant (feature, structure)-label mapping often depend on oversimplified assumptions about the data generation process. We introduce a more realistic graph data generation model using Structural Causal Models (SCMs). We propose a causal decoupling framework, DeCaf, that independently learns unbiased feature-label and structure-label mappings.
arXiv Detail & Related papers (2024-10-27T00:22:18Z)
Learning Divergence Fields for Shift-Robust Graph Representations [73.11818515795761]
In this work, we propose a geometric diffusion model with learnable divergence fields for the challenging problem with interdependent data. We derive a new learning objective through causal inference, which can guide the model to learn generalizable patterns of interdependence that are insensitive across domains.
arXiv Detail & Related papers (2024-06-07T14:29:21Z)
ARC: A Generalist Graph Anomaly Detector with In-Context Learning [62.202323209244]
ARC is a generalist GAD approach that enables a "one-for-all" GAD model to detect anomalies across various graph datasets on-the-fly. Equipped with in-context learning, ARC can directly extract dataset-specific patterns from the target dataset. Extensive experiments on multiple benchmark datasets from various domains demonstrate the superior anomaly detection performance, efficiency, and generalizability of ARC.
arXiv Detail & Related papers (2024-05-27T02:42:33Z)
Graphs Generalization under Distribution Shifts [11.963958151023732]
We introduce a novel framework, namely Graph Learning Invariant Domain genERation (GLIDER). Our model outperforms baseline methods on node-level OOD generalization across domains when distribution shifts occur simultaneously in node features and topological structures.
arXiv Detail & Related papers (2024-03-25T00:15:34Z)
Score-based Causal Representation Learning: Linear and General Transformations [31.786444957887472]
The paper addresses both the identifiability and achievability aspects. It designs a score-based class of algorithms that ensures both identifiability and achievability. Results are validated via experiments on structured synthetic data and image data.
arXiv Detail & Related papers (2024-02-01T18:40:03Z)
iSCAN: Identifying Causal Mechanism Shifts among Nonlinear Additive Noise Models [48.33685559041322]
This paper focuses on identifying the causal mechanism shifts in two or more related datasets over the same set of variables. Code implementing the proposed method is open-source and publicly available at https://github.com/kevinsbello/iSCAN.
arXiv Detail & Related papers (2023-06-30T01:48:11Z)
Effect Identification in Cluster Causal Diagrams [51.42809552422494]
We introduce a new type of graphical model called cluster causal diagrams (for short, C-DAGs). C-DAGs allow for the partial specification of relationships among variables based on limited prior knowledge. We develop the foundations and machinery for valid causal inferences over C-DAGs.
arXiv Detail & Related papers (2022-02-22T21:27:31Z)
Partial Counterfactual Identification from Observational and Experimental Data [83.798237968683]
We develop effective Monte Carlo algorithms to approximate the optimal bounds from an arbitrary combination of observational and experimental data. Our algorithms are validated extensively on synthetic and real-world datasets.
arXiv Detail & Related papers (2021-10-12T02:21:30Z)
Instrumental Variable-Driven Domain Generalization with Unobserved Confounders [53.735614014067394]
Domain generalization (DG) aims to learn from multiple source domains a model that can generalize well on unseen target domains. We propose an instrumental variable-driven DG method (IV-DG) by removing the bias of the unobserved confounders with two-stage learning. In the first stage, it learns the conditional distribution of the input features of one domain given input features of another domain. In the second stage, it estimates the relationship by predicting labels with the learned conditional distribution.
arXiv Detail & Related papers (2021-10-04T13:32:57Z)
OoD-Bench: Benchmarking and Understanding Out-of-Distribution Generalization Datasets and Algorithms [28.37021464780398]
We show that existing OoD algorithms that outperform empirical risk minimization on one distribution shift usually have limitations on the other distribution shift. The new benchmark may serve as a strong foothold that can be resorted to by future OoD generalization research.
arXiv Detail & Related papers (2021-06-07T15:34:36Z)

This list is automatically generated from the titles and abstracts of the papers on this site.