Causality is all you need
- URL: http://arxiv.org/abs/2311.12307v1
- Date: Tue, 21 Nov 2023 02:53:40 GMT
- Title: Causality is all you need
- Authors: Ning Xu, Yifei Gao, Hongshuo Tian, Yongdong Zhang, An-An Liu
- Abstract summary: Causal Graph Routing (CGR) is an integrated causal scheme relying entirely on the intervention mechanisms to reveal the cause-effect forces hidden in data.
CGR can surpass the current state-of-the-art methods on both Visual Question Answering and Long Document Classification tasks.
- Score: 63.10680366545293
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the fundamental statistics course, students are taught to remember the
well-known saying: "Correlation is not Causation". Till now, statistics (i.e.,
correlation) have developed various successful frameworks, such as Transformer
and Pre-training large-scale models, which have stacked multiple parallel
self-attention blocks to imitate a wide range of tasks. However, in the
causation community, how to build an integrated causal framework still remains
an untouched domain despite its excellent intervention capabilities. In this
paper, we propose the Causal Graph Routing (CGR) framework, an integrated
causal scheme relying entirely on the intervention mechanisms to reveal the
cause-effect forces hidden in data. Specifically, CGR is composed of a stack of
causal layers. Each layer includes a set of parallel deconfounding blocks from
different causal graphs. We combine these blocks via the concept of the
proposed sufficient cause, which allows the model to dynamically select the
suitable deconfounding methods in each layer. CGR is implemented as stacked
networks, integrating no confounder, back-door adjustment, front-door
adjustment, and probability of sufficient cause. We evaluate this framework on
two classical tasks of CV and NLP. Experiments show CGR can surpass the current
state-of-the-art methods on both Visual Question Answering and Long Document
Classification tasks. In particular, CGR has great potential in building the
"causal" pre-training large-scale model that effectively generalizes to diverse
tasks. It will improve the machines' comprehension of causal relationships
within a broader semantic space.
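
For readers unfamiliar with the back-door adjustment named in the abstract, the following is a minimal numerical sketch of the standard formula P(Y | do(X=x)) = sum over z of P(Y | x, z) P(z). It is not CGR's implementation; the binary variables and all probabilities below are hypothetical values chosen only for illustration.

```python
# Toy back-door adjustment with a binary confounder Z, treatment X, outcome Y.
# All probabilities are hypothetical illustration values, not from the paper.

p_z = {0: 0.5, 1: 0.5}                     # P(Z=z)
p_x1_given_z = {0: 0.2, 1: 0.8}            # P(X=1 | Z=z)
p_y1_given_xz = {                          # P(Y=1 | X=x, Z=z)
    (0, 0): 0.1, (0, 1): 0.5,
    (1, 0): 0.3, (1, 1): 0.7,
}


def p_x_given_z(x, z):
    """P(X=x | Z=z) for binary X."""
    p1 = p_x1_given_z[z]
    return p1 if x == 1 else 1.0 - p1


def backdoor(x):
    """Interventional P(Y=1 | do(X=x)): average over the marginal P(z)."""
    return sum(p_y1_given_xz[(x, z)] * p_z[z] for z in p_z)


def observational(x):
    """Observational P(Y=1 | X=x): average over P(z | x), which is confounded."""
    p_x = sum(p_x_given_z(x, z) * p_z[z] for z in p_z)
    joint = sum(p_y1_given_xz[(x, z)] * p_x_given_z(x, z) * p_z[z] for z in p_z)
    return joint / p_x


causal_effect = backdoor(1) - backdoor(0)
naive_difference = observational(1) - observational(0)
print(f"causal effect:    {causal_effect:.2f}")
print(f"naive difference: {naive_difference:.2f}")
```

In this toy setting the naive conditional difference is larger than the adjusted effect, because Z raises both the chance of treatment and the chance of the outcome.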
Related papers
- Self-Clustering Hierarchical Multi-Agent Reinforcement Learning with Extensible Cooperation Graph [9.303181273699417]
This paper proposes a novel hierarchical MARL model called Hierarchical Cooperation Graph Learning (HCGL).
HCGL has three components: a dynamic Extensible Cooperation Graph (ECG) for achieving self-clustering cooperation; a group of graph operators for adjusting the topology of the ECG; and an MARL optimizer for training these graph operators.
In our experiments, the HCGL model has shown outstanding performance in multi-agent benchmarks with sparse rewards.
arXiv Detail & Related papers (2024-03-26T19:19:16Z)
- CORE: Towards Scalable and Efficient Causal Discovery with Reinforcement Learning [2.7446241148152253]
CORE is a reinforcement learning-based approach for causal discovery and intervention planning.
Our results demonstrate that CORE generalizes to unseen graphs and efficiently uncovers causal structures.
CORE scales to larger graphs with up to 10 variables and outperforms existing approaches in structure estimation accuracy and sample efficiency.
arXiv Detail & Related papers (2024-01-30T12:57:52Z)
- PGODE: Towards High-quality System Dynamics Modeling [40.76121531452706]
This paper studies the problem of modeling multi-agent dynamical systems, where agents could interact mutually to influence their behaviors.
Recent research predominantly uses geometric graphs to depict these mutual interactions, which are then captured by graph neural networks (GNNs).
We propose a new approach named Prototypical Graph ODE to address the problem.
arXiv Detail & Related papers (2023-11-11T12:04:47Z)
- Submodel Partitioning in Hierarchical Federated Learning: Algorithm Design and Convergence Analysis [15.311309249848739]
Hierarchical federated learning (HFL) has demonstrated promising scalability advantages over the traditional "star-topology" architecture-based federated learning (FL).
In this paper, we propose hierarchical independent submodel training (HIST) for training over resource-constrained Internet of Things (IoT) devices.
The key idea behind HIST is a hierarchical version of model partitioning, where we partition the global model into disjoint submodels in each round and distribute them across different cells.
arXiv Detail & Related papers (2023-10-27T04:42:59Z)
- Counterfactual Intervention Feature Transfer for Visible-Infrared Person Re-identification [69.45543438974963]
We find graph-based methods in the visible-infrared person re-identification task (VI-ReID) suffer from bad generalization because of two issues.
The well-trained input features weaken the learning of graph topology, making it not generalized enough during the inference process.
We propose a Counterfactual Intervention Feature Transfer (CIFT) method to tackle these problems.
arXiv Detail & Related papers (2022-08-01T16:15:31Z)
- Comprehensive Graph Gradual Pruning for Sparse Training in Graph Neural Networks [52.566735716983956]
We propose a graph gradual pruning framework termed CGP to dynamically prune GNNs.
Unlike LTH-based methods, the proposed CGP approach requires no re-training, which significantly reduces the computation costs.
Our proposed strategy greatly improves both training and inference efficiency while matching or even exceeding the accuracy of existing methods.
arXiv Detail & Related papers (2022-07-18T14:23:31Z)
- Explainable Sparse Knowledge Graph Completion via High-order Graph Reasoning Network [111.67744771462873]
This paper proposes a novel explainable model for sparse Knowledge Graphs (KGs).
It combines high-order reasoning into a graph convolutional network, namely HoGRN.
It can not only improve the generalization ability to mitigate the information insufficiency issue but also provide interpretability.
arXiv Detail & Related papers (2022-07-14T10:16:56Z)
- Effect Identification in Cluster Causal Diagrams [51.42809552422494]
We introduce a new type of graphical model called cluster causal diagrams (C-DAGs for short).
C-DAGs allow for the partial specification of relationships among variables based on limited prior knowledge.
We develop the foundations and machinery for valid causal inferences over C-DAGs.
arXiv Detail & Related papers (2022-02-22T21:27:31Z)
- Spatial-spectral Hyperspectral Image Classification via Multiple Random Anchor Graphs Ensemble Learning [88.60285937702304]
This paper proposes a novel spatial-spectral HSI classification method via multiple random anchor graphs ensemble learning (RAGE).
Firstly, the local binary pattern is adopted to extract the more descriptive features on each selected band, which preserves local structures and subtle changes of a region.
Secondly, the adaptive neighbors assignment is introduced in the construction of anchor graph, to reduce the computational complexity.
arXiv Detail & Related papers (2021-03-25T09:31:41Z)
- Neural Stochastic Block Model & Scalable Community-Based Graph Learning [8.00785050036369]
This paper proposes a scalable community-based neural framework for graph learning.
The framework learns the graph topology through the task of community detection and link prediction.
We look into two particular applications: graph alignment and anomalous correlation detection.
arXiv Detail & Related papers (2020-05-16T03:28:50Z)