Causal Abstractions of Neural Networks
- URL: http://arxiv.org/abs/2106.02997v1
- Date: Sun, 6 Jun 2021 01:07:43 GMT
- Title: Causal Abstractions of Neural Networks
- Authors: Atticus Geiger, Hanson Lu, Thomas Icard, Christopher Potts
- Abstract summary: We propose a new structural analysis method grounded in a formal theory of \textit{causal abstraction}.
We apply this method to analyze neural models trained on the Multiply Quantified Natural Language Inference (MQNLI) corpus.
- Score: 9.291492712301569
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Structural analysis methods (e.g., probing and feature attribution) are
increasingly important tools for neural network analysis. We propose a new
structural analysis method grounded in a formal theory of \textit{causal
abstraction} that provides rich characterizations of model-internal
representations and their roles in input/output behavior. In this method,
neural representations are aligned with variables in interpretable causal
models, and then \textit{interchange interventions} are used to experimentally
verify that the neural representations have the causal properties of their
aligned variables. We apply this method in a case study to analyze neural
models trained on the Multiply Quantified Natural Language Inference (MQNLI)
corpus, a highly complex NLI dataset that was constructed with a
tree-structured natural logic causal model. We discover that a BERT-based model
with state-of-the-art performance successfully realizes the approximate causal
structure of the natural logic causal model, whereas a simpler baseline model
fails to show any such structure, demonstrating that neural representations
encode the compositional structure of MQNLI examples.
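The core experimental operation is the interchange intervention. Below is a minimal sketch of that operation on a toy two-layer network, assuming a hypothesized alignment between a few hidden units and a causal-model variable V; the network, the alignment, and all names are illustrative, not the paper's actual models or code.

```python
# Interchange intervention sketch: run the network on a source input, cache
# the activations of the hidden units aligned with causal variable V, then
# re-run on a base input with those activations swapped in.
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(4, 8)), rng.normal(size=(8, 2))

def forward(x, swap=None):
    """Run the toy MLP; `swap=(units, values)` overwrites those hidden
    units with activations cached from another (source) input."""
    h = np.tanh(x @ W1)
    if swap is not None:
        units, values = swap
        h = h.copy()
        h[units] = values
    return h @ W2

base = rng.normal(size=4)    # input whose behavior we intervene on
source = rng.normal(size=4)  # input donating the aligned representation

aligned_units = [0, 1]                        # hypothesized location of V
source_vals = np.tanh(source @ W1)[aligned_units]

y_base = forward(base)
y_intervened = forward(base, swap=(aligned_units, source_vals))
print(y_base, y_intervened)
```

If the alignment is causally faithful, the intervened network output should match what the interpretable causal model predicts for the base input with V set to its value on the source input; aggregating this agreement over many base/source pairs scores the alignment.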
Related papers
- Structure of Artificial Neural Networks -- Empirical Investigations [0.0] (2024-10-12T16:13:28Z)
  Within a decade, deep learning overtook the dominant solution methods for countless problems in artificial intelligence.
  With a formal definition of neural network structure, neural architecture search problems and their solution methods can be formulated under a common framework.
  Does structure make a difference, or can it be chosen arbitrarily?
- Hidden Holes: topological aspects of language models [1.1172147007388977] (2024-06-09T14:25:09Z)
  We study the evolution of topological structure in GPT-based large language models across depth and time during training.
  We show that the latter exhibit more topological complexity, with a distinct pattern of changes common to all natural languages but absent from synthetically generated data.
- Consistency of Neural Causal Partial Identification [17.503562318576414] (2024-05-24T16:12:39Z)
  We show consistency of partial identification via Neural Causal Models (NCMs) in a general setting with both continuous and categorical variables.
  The results highlight the impact of the underlying neural network architecture's depth and connectivity.
  We provide a counterexample showing that, without Lipschitz regularization, the NCM may not be consistent (a schematic sketch of such regularization follows).
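A minimal sketch, assuming PyTorch, of the kind of Lipschitz control this consistency result motivates: an NCM mechanism whose linear layers are spectral-normalized, which bounds the network's Lipschitz constant when the activations are themselves 1-Lipschitz. The architecture is illustrative, not the paper's construction.

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

def lipschitz_mechanism(n_parents: int, hidden: int = 32) -> nn.Module:
    """Mechanism f(parents, noise) -> value with spectrally normalized
    weights; tanh is 1-Lipschitz, so the composed map is Lipschitz."""
    return nn.Sequential(
        spectral_norm(nn.Linear(n_parents + 1, hidden)),  # +1 exogenous noise
        nn.Tanh(),
        spectral_norm(nn.Linear(hidden, 1)),
    )

f = lipschitz_mechanism(n_parents=2)
parents = torch.randn(16, 2)
noise = torch.randn(16, 1)
value = f(torch.cat([parents, noise], dim=-1))  # one SCM assignment V := f(Pa, U)
```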
- LOGICSEG: Parsing Visual Semantics with Neural Logic Learning and Reasoning [73.98142349171552] (2023-09-24T05:43:19Z)
  LOGICSEG is a holistic visual semantic parser that integrates neural inductive learning and logic reasoning with both rich data and symbolic knowledge.
  During fuzzy logic-based continuous relaxation, logical formulae are grounded onto data and neural computational graphs, enabling logic-induced network training (illustrated below).
  These designs together make LOGICSEG a general and compact neural-logic machine that is readily integrated into existing segmentation models.
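A hedged sketch of what fuzzy-logic continuous relaxation looks like in this spirit: a symbolic rule such as "cat(x) -> animal(x)" is relaxed into a differentiable truth value over predicted class probabilities, so rule violations become a trainable penalty. The rule, the Reichenbach implication, and all names are illustrative choices, not LOGICSEG's exact formulation.

```python
import torch

def implies(p: torch.Tensor, q: torch.Tensor) -> torch.Tensor:
    """Fuzzy implication via the Reichenbach relaxation 1 - p + p*q."""
    return 1.0 - p + p * q

# Per-pixel class probabilities from a (hypothetical) segmentation head.
p_cat = torch.rand(8, requires_grad=True)
p_animal = torch.rand(8, requires_grad=True)

# Logic-induced training signal: minimize the negation of the grounded
# formula, alongside the usual segmentation loss.
logic_loss = (1.0 - implies(p_cat, p_animal)).mean()
logic_loss.backward()  # gradients flow into the network that produced p_*
```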
- On the Trade-off Between Efficiency and Precision of Neural Abstraction [62.046646433536104] (2023-07-28T13:22:32Z)
  Neural abstractions have recently been introduced as formal approximations of complex, nonlinear dynamical models.
  We employ formal inductive synthesis procedures to generate neural abstractions that result in dynamical models with these semantics; the sketch below shows the plain approximation step, without the formal certificate.
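A rough sketch of the approximation half of neural abstraction: fit a small ReLU network to a nonlinear vector field and measure the worst observed error on a sample grid. The formal step that the paper's synthesis procedures add, certifying a sound error bound (e.g., with an SMT solver), is deliberately not reproduced here; the dynamics and hyperparameters are illustrative.

```python
import torch
import torch.nn as nn

dynamics = lambda x: torch.sin(3 * x) - x ** 3   # illustrative nonlinear model
net = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)

xs = torch.linspace(-1, 1, 1024).unsqueeze(1)
for _ in range(500):
    loss = ((net(xs) - dynamics(xs)) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# Empirical error estimate only; a neural abstraction pairs the ReLU net
# with a *certified* eps such that |f(x) - net(x)| <= eps for all x.
eps_hat = (net(xs) - dynamics(xs)).abs().max().item()
print(f"empirical max error: {eps_hat:.4f}")
```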
- On the Generalization and Adaption Performance of Causal Models [99.64022680811281] (2022-06-09T17:12:32Z)
  Differentiable causal discovery proposes to factorize the data-generating process into a set of modules.
  We study the generalization and adaptation performance of such modular neural causal models.
  Our analysis shows that modular neural causal models outperform other models on both zero-shot and few-shot adaptation in low-data regimes.
- Amortized Inference for Causal Structure Learning [72.84105256353801] (2022-05-25T17:37:08Z)
  Learning causal structure poses a search problem that typically involves evaluating candidate structures with a score or independence test.
  Instead, we train a variational inference model to predict the causal structure directly from observational or interventional data (see the sketch below).
  Our models exhibit robust generalization capabilities under substantial distribution shift.
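A minimal sketch, assuming PyTorch, of amortized structure prediction: a network embeds a whole dataset (n samples by d variables) and emits a d-by-d matrix of edge probabilities. This is a schematic stand-in for the paper's variational model, not its architecture.

```python
import torch
import torch.nn as nn

class AmortizedStructureModel(nn.Module):
    def __init__(self, d: int, hidden: int = 64):
        super().__init__()
        self.embed = nn.Sequential(nn.Linear(d, hidden), nn.ReLU())
        self.edges = nn.Linear(hidden, d * d)
        self.d = d

    def forward(self, data: torch.Tensor) -> torch.Tensor:
        # Mean-pool per-sample embeddings so the output is invariant to
        # the ordering of the n observations.
        h = self.embed(data).mean(dim=0)
        return torch.sigmoid(self.edges(h)).view(self.d, self.d)

model = AmortizedStructureModel(d=5)
dataset = torch.randn(200, 5)   # observational data
edge_probs = model(dataset)     # P(i -> j) for each ordered variable pair
```

Training such a model would maximize the likelihood of ground-truth graphs over many simulated (dataset, graph) pairs, which is what lets inference on a new dataset amortize the search.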
- A Physics-Guided Neural Operator Learning Approach to Model Biological Tissues from Digital Image Correlation Measurements [3.65211252467094] (2022-04-01T04:56:41Z)
  We present a data-driven approach to biological tissue modeling, which aims to predict the displacement field based on digital image correlation (DIC) measurements under unseen loading scenarios.
  A material database is constructed from DIC displacement-tracking measurements of multiple biaxial stretching protocols on a porcine tricuspid valve leaflet.
  The material response is modeled as a solution operator from the loading to the resultant displacement field, with the material properties learned implicitly from the data and naturally embedded in the network parameters (a schematic operator-learning layout follows).
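A schematic sketch, assuming PyTorch, of operator learning in this spirit: a branch net encodes the loading (sampled at fixed sensor points) and a trunk net encodes a query location; their dot product gives the predicted displacement there. This DeepONet-style layout is an illustrative stand-in for the paper's physics-guided operator, and all sizes are made up.

```python
import torch
import torch.nn as nn

class SimpleDeepONet(nn.Module):
    def __init__(self, n_sensors: int = 50, width: int = 64):
        super().__init__()
        self.branch = nn.Sequential(nn.Linear(n_sensors, width), nn.ReLU(), nn.Linear(width, width))
        self.trunk = nn.Sequential(nn.Linear(2, width), nn.ReLU(), nn.Linear(width, width))

    def forward(self, loading: torch.Tensor, xy: torch.Tensor) -> torch.Tensor:
        # displacement(xy) ~ <branch(loading), trunk(xy)>
        return (self.branch(loading)[:, None, :] * self.trunk(xy)[None, :, :]).sum(-1)

model = SimpleDeepONet()
loading = torch.randn(8, 50)   # 8 loading protocols sampled at 50 sensors
coords = torch.rand(100, 2)    # 100 material points to query
u = model(loading, coords)     # predicted displacement per protocol and point
# Training would regress u against DIC-tracked displacements across protocols.
```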
- The Causal Neural Connection: Expressiveness, Learnability, and Inference [125.57815987218756] (2021-07-02T01:55:18Z)
  An object called a structural causal model (SCM) represents a collection of mechanisms and sources of random variation of the system under investigation.
  In this paper, we show that the causal hierarchy theorem (Thm. 1, Bareinboim et al., 2020) still holds for neural models.
  We introduce a special type of SCM called a neural causal model (NCM), and formalize a new type of inductive bias to encode the structural constraints necessary for performing causal inferences (a minimal sketch follows).
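A minimal sketch of an NCM as the summary describes it: each endogenous variable gets an independent exogenous noise source and a feedforward mechanism of its parents, so the structural constraint (who may depend on whom) is encoded in the wiring. The two-variable graph X -> Y is an illustrative choice, not the paper's example.

```python
from typing import Optional

import torch
import torch.nn as nn

class TwoVariableNCM(nn.Module):
    def __init__(self, hidden: int = 32):
        super().__init__()
        self.f_x = nn.Sequential(nn.Linear(1, hidden), nn.ReLU(), nn.Linear(hidden, 1))
        self.f_y = nn.Sequential(nn.Linear(2, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, n: int, do_x: Optional[torch.Tensor] = None):
        u_x, u_y = torch.randn(n, 1), torch.randn(n, 1)  # exogenous noise
        # do(X = x) replaces X's mechanism with a constant, as in an SCM.
        x = self.f_x(u_x) if do_x is None else do_x.expand(n, 1)
        y = self.f_y(torch.cat([x, u_y], dim=-1))        # Y := f_y(X, U_y)
        return x, y

ncm = TwoVariableNCM()
x_obs, y_obs = ncm(n=1000)                          # observational samples
x_int, y_int = ncm(n=1000, do_x=torch.tensor(1.0))  # interventional samples
```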
- Exploring End-to-End Differentiable Natural Logic Modeling [21.994060519995855] (2020-11-08T18:18:15Z)
  We explore end-to-end trained differentiable models that integrate natural logic with neural networks.
  The proposed model adapts module networks to model natural logic operations and is enhanced with a memory component to model contextual information (see the sketch below).
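A hedged sketch of one way such a differentiable natural logic module could look: lexical relations between aligned premise/hypothesis words are soft distributions over MacCartney's seven natural logic relations, and a learnable composition tensor folds them into a sentence-level relation. The learned tensor is an illustrative device; natural logic's hand-built join table could instead initialize or constrain it.

```python
import torch
import torch.nn as nn

N_REL = 7  # equivalence, fwd/rev entailment, negation, alternation, cover, independence

class RelationComposer(nn.Module):
    def __init__(self):
        super().__init__()
        # Soft stand-in for the natural logic join table.
        self.join = nn.Parameter(torch.randn(N_REL, N_REL, N_REL))

    def forward(self, rels: torch.Tensor) -> torch.Tensor:
        """Fold a sequence of soft relations (seq_len, 7) into one (7,)."""
        state = rels[0]
        for r in rels[1:]:
            state = torch.softmax(
                torch.einsum("i,j,ijk->k", state, r, self.join), dim=-1
            )
        return state

composer = RelationComposer()
word_relations = torch.softmax(torch.randn(5, N_REL), dim=-1)  # from a lexical module
sentence_relation = composer(word_relations)  # distribution over the final relation
```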
- Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758] (2020-07-02T17:55:47Z)
  We study estimation in a class of generalized structural equation models (SEMs).
  We formulate the linear operator equation as a min-max game in which both players are parameterized by neural networks (NNs), and we learn the parameters of these networks by gradient descent (a toy version of this game appears below).
  For the first time, we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
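A minimal sketch, assuming PyTorch, of such a min-max game on a toy instrumental-variable problem: a structural function f fits the equation while an adversarial critic g searches over test functions of the instrument Z. The objective E[(Y - f(X)) g(Z) - g(Z)^2 / 2] is one standard choice for adversarial moment matching, used here for illustration rather than as the paper's exact formulation.

```python
import torch
import torch.nn as nn

f = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))  # structural function
g = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))  # adversarial critic
opt_f = torch.optim.Adam(f.parameters(), lr=1e-3)
opt_g = torch.optim.Adam(g.parameters(), lr=1e-3)

for step in range(200):
    # Toy IV data: Z instruments X; U confounds X and Y; true effect is 2.
    z = torch.randn(512, 1)
    u = torch.randn(512, 1)
    x = z + u
    y = 2.0 * x + u
    # Critic ascends the game value over test functions of the instrument Z.
    game = ((y - f(x).detach()) * g(z) - 0.5 * g(z) ** 2).mean()
    opt_g.zero_grad()
    (-game).backward()
    opt_g.step()
    # The structural function descends against the (frozen) critic.
    loss = ((y - f(x)) * g(z).detach()).mean()
    opt_f.zero_grad()
    loss.backward()
    opt_f.step()
```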
This list is automatically generated from the titles and abstracts of the papers on this site.