Related papers: DeCaf: A Causal Decoupling Framework for OOD Generalization on Node Classification

URL: http://arxiv.org/abs/2410.20295v1
Date: Sun, 27 Oct 2024 00:22:18 GMT
Title: DeCaf: A Causal Decoupling Framework for OOD Generalization on Node Classification
Authors: Xiaoxue Han, Huzefa Rangwala, Yue Ning
Abstract summary: Graph Neural Networks (GNNs) are susceptible to distribution shifts, creating vulnerability and security issues in critical domains. Existing methods that target learning an invariant (feature, structure)-label mapping often depend on oversimplified assumptions about the data generation process. We introduce a more realistic graph data generation model using Structural Causal Models (SCMs). We propose a causal decoupling framework, DeCaf, that independently learns unbiased feature-label and structure-label mappings.
Score: 14.96980804513399
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Graph Neural Networks (GNNs) are susceptible to distribution shifts, creating vulnerability and security issues in critical domains. There is a pressing need to enhance the generalizability of GNNs on out-of-distribution (OOD) test data. Existing methods that target learning an invariant (feature, structure)-label mapping often depend on oversimplified assumptions about the data generation process, which do not adequately reflect the actual dynamics of distribution shifts in graphs. In this paper, we introduce a more realistic graph data generation model using Structural Causal Models (SCMs), allowing us to redefine distribution shifts by pinpointing their origins within the generation process. Building on this, we propose a causal decoupling framework, DeCaf, that independently learns unbiased feature-label and structure-label mappings. We provide a detailed theoretical framework that shows how our approach can effectively mitigate the impact of various distribution shifts. We evaluate DeCaf across both real-world and synthetic datasets that demonstrate different patterns of shifts, confirming its efficacy in enhancing the generalizability of GNNs.

Related papers

Causal invariant geographic network representations with feature and structural distribution shifts [5.237838679495733]
Existing methods learn geographic network representations through deep graph neural networks (GNNs) based on the i.i.d. assumption. We propose a feature-structure mixed invariant representation learning (FSM-IRL) model that accounts for both feature distribution shifts and structural distribution shifts. Experiments demonstrate that FSM-IRL exhibits strong learning capabilities on both geographic and social network datasets in OOD scenarios.
arXiv Detail & Related papers (2025-03-25T06:21:57Z)
AdaRC: Mitigating Graph Structure Shifts during Test-Time [66.40525136929398]
Test-time adaptation (TTA) has attracted attention due to its ability to adapt a pre-trained model to a target domain without re-accessing the source domain. We propose AdaRC, an innovative framework designed for effective and efficient adaptation to structure shifts in graphs.
arXiv Detail & Related papers (2024-10-09T15:15:40Z)
xAI-Drop: Don't Use What You Cannot Explain [23.33477769275026]
Graph Neural Networks (GNNs) have emerged as the predominant paradigm for learning from graph-structured data. GNNs face challenges such as lack of generalization and poor interpretability. We introduce xAI-Drop, a novel topological-level dropping regularizer.
arXiv Detail & Related papers (2024-07-29T14:53:45Z)
Learning Divergence Fields for Shift-Robust Graph Representations [73.11818515795761]
In this work, we propose a geometric diffusion model with learnable divergence fields for the challenging problem with interdependent data. We derive a new learning objective through causal inference, which can guide the model to learn generalizable patterns of interdependence that are insensitive across domains.
arXiv Detail & Related papers (2024-06-07T14:29:21Z)
Learning Invariant Representations of Graph Neural Networks via Cluster Generalization [58.68231635082891]
Graph neural networks (GNNs) have become increasingly popular in modeling graph-structured data. In this paper, we experimentally find that the performance of GNNs drops significantly when a structure shift occurs. We propose the Cluster Information Transfer (CIT) mechanism, which can learn invariant representations for GNNs.
arXiv Detail & Related papers (2024-03-06T10:36:56Z)
Graph Out-of-Distribution Generalization via Causal Intervention [69.70137479660113]
We introduce a conceptually simple yet principled approach for training robust graph neural networks (GNNs) under node-level distribution shifts. Our method resorts to a new learning objective derived from causal inference that coordinates an environment estimator and a mixture-of-expert GNN predictor. Our model can effectively enhance generalization with various types of distribution shifts and yield up to 27.4% accuracy improvement over the state of the art on graph OOD generalization benchmarks.
arXiv Detail & Related papers (2024-02-18T07:49:22Z)
Boosted Control Functions: Distribution generalization and invariance in confounded models [10.503777692702952]
We introduce a strong notion of invariance that allows for distribution generalization even in the presence of nonlinear, non-identifiable structural functions. We propose the ControlTwicing algorithm to estimate the Boosted Control Function (BCF) using flexible machine-learning techniques.
arXiv Detail & Related papers (2023-10-09T15:43:46Z)
MixupExplainer: Generalizing Explanations for Graph Neural Networks with Data Augmentation [6.307753856507624]
Graph Neural Networks (GNNs) have received increasing attention due to their ability to learn from graph-structured data. Post-hoc instance-level explanation methods have been proposed to understand GNN predictions. We shed light on the existence of the distribution shifting issue in existing methods, which affects explanation quality.
arXiv Detail & Related papers (2023-07-15T15:46:38Z)
iSCAN: Identifying Causal Mechanism Shifts among Nonlinear Additive Noise Models [48.33685559041322]
This paper focuses on identifying the causal mechanism shifts in two or more related datasets over the same set of variables. Code implementing the proposed method is open-source and publicly available at https://github.com/kevinsbello/iSCAN.
arXiv Detail & Related papers (2023-06-30T01:48:11Z)
Energy-based Out-of-Distribution Detection for Graph Neural Networks [76.0242218180483]
We propose a simple, powerful and efficient OOD detection model for GNN-based learning on graphs, which we call GNNSafe. GNNSafe achieves up to 17.0% AUROC improvement over the state of the art and could serve as a simple yet strong baseline in such an under-developed area.
arXiv Detail & Related papers (2023-02-06T16:38:43Z)
Modeling the Data-Generating Process is Necessary for Out-of-Distribution Generalization [23.302060306322506]
Real-world data often has multiple distribution shifts over different attributes. No state-of-the-art DG algorithm performs consistently well on all shifts. We develop Causally Adaptive Constraint Minimization (CACM), an algorithm that uses knowledge about the data-generating process to adaptively identify and apply the correct independence constraints for regularization.
arXiv Detail & Related papers (2022-06-15T22:35:06Z)
Handling Distribution Shifts on Graphs: An Invariance Perspective [78.31180235269035]
We formulate the OOD problem on graphs and develop a new invariant learning approach, Explore-to-Extrapolate Risk Minimization (EERM). EERM resorts to multiple context explorers that are adversarially trained to maximize the variance of risks from multiple virtual environments. We prove the validity of our method by theoretically showing its guarantee of a valid OOD solution.
arXiv Detail & Related papers (2022-02-05T02:31:01Z)
Accuracy on the Line: On the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization [89.73665256847858]
We show that out-of-distribution performance is strongly correlated with in-distribution performance for a wide range of models and distribution shifts. Specifically, we demonstrate strong correlations between in-distribution and out-of-distribution performance on variants of CIFAR-10 & ImageNet. We also investigate cases where the correlation is weaker, for instance some synthetic distribution shifts from CIFAR-10-C and the tissue classification dataset Camelyon17-WILDS.
arXiv Detail & Related papers (2021-07-09T19:48:23Z)

This list is automatically generated from the titles and abstracts of the papers in this site.