Finding Alignments Between Interpretable Causal Variables and
Distributed Neural Representations
- URL: http://arxiv.org/abs/2303.02536v4
- Date: Wed, 21 Feb 2024 23:23:18 GMT
- Title: Finding Alignments Between Interpretable Causal Variables and
Distributed Neural Representations
- Authors: Atticus Geiger and Zhengxuan Wu and Christopher Potts and Thomas Icard
and Noah D. Goodman
- Abstract summary: Causal abstraction is a promising theoretical framework for explainable artificial intelligence.
Existing causal abstraction methods require a brute-force search over alignments between the high-level model and the low-level one.
We present distributed alignment search (DAS), which overcomes these limitations.
- Score: 62.65877150123775
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Causal abstraction is a promising theoretical framework for explainable
artificial intelligence that defines when an interpretable high-level causal
model is a faithful simplification of a low-level deep learning system.
However, existing causal abstraction methods have two major limitations: they
require a brute-force search over alignments between the high-level model and
the low-level one, and they presuppose that variables in the high-level model
will align with disjoint sets of neurons in the low-level one. In this paper,
we present distributed alignment search (DAS), which overcomes these
limitations. In DAS, we find the alignment between high-level and low-level
models using gradient descent rather than conducting a brute-force search, and
we allow individual neurons to play multiple distinct roles by analyzing
representations in non-standard bases-distributed representations. Our
experiments show that DAS can discover internal structure that prior approaches
miss. Overall, DAS removes previous obstacles to conducting causal abstraction
analyses and allows us to find conceptual structure in trained neural nets.
Related papers
- Learning Causal Abstractions of Linear Structural Causal Models [18.132607344833925]
Causal Abstraction provides a framework for formalizing two Structural Causal Models at different levels of detail.
We tackle both issues for linear causal models with linear abstraction functions.
In particular, we introduce Abs-LiNGAM, a method that leverages the constraints induced by the learned high-level model and the abstraction function to speedup the recovery of the larger low-level model.
arXiv Detail & Related papers (2024-06-01T10:42:52Z) - The twin peaks of learning neural networks [3.382017614888546]
Recent works demonstrated the existence of a double-descent phenomenon for the generalization error of neural networks.
We explore a link between this phenomenon and the increase of complexity and sensitivity of the function represented by neural networks.
arXiv Detail & Related papers (2024-01-23T10:09:14Z) - Causal Triplet: An Open Challenge for Intervention-centric Causal
Representation Learning [98.78136504619539]
Causal Triplet is a causal representation learning benchmark featuring visually more complex scenes.
We show that models built with the knowledge of disentangled or object-centric representations significantly outperform their distributed counterparts.
arXiv Detail & Related papers (2023-01-12T17:43:38Z) - Unifying Causal Inference and Reinforcement Learning using Higher-Order
Category Theory [4.119151469153588]
We present a unified formalism for structure discovery of causal models and predictive state representation models in reinforcement learning.
Specifically, we model structure discovery in both settings using simplicial objects.
arXiv Detail & Related papers (2022-09-13T19:04:18Z) - Systematic Evaluation of Causal Discovery in Visual Model Based
Reinforcement Learning [76.00395335702572]
A central goal for AI and causality is the joint discovery of abstract representations and causal structure.
Existing environments for studying causal induction are poorly suited for this objective because they have complicated task-specific causal graphs.
In this work, our goal is to facilitate research in learning representations of high-level variables as well as causal structures among them.
arXiv Detail & Related papers (2021-07-02T05:44:56Z) - ACRE: Abstract Causal REasoning Beyond Covariation [90.99059920286484]
We introduce the Abstract Causal REasoning dataset for systematic evaluation of current vision systems in causal induction.
Motivated by the stream of research on causal discovery in Blicket experiments, we query a visual reasoning system with the following four types of questions in either an independent scenario or an interventional scenario.
We notice that pure neural models tend towards an associative strategy under their chance-level performance, whereas neuro-symbolic combinations struggle in backward-blocking reasoning.
arXiv Detail & Related papers (2021-03-26T02:42:38Z) - Gradient Starvation: A Learning Proclivity in Neural Networks [97.02382916372594]
Gradient Starvation arises when cross-entropy loss is minimized by capturing only a subset of features relevant for the task.
This work provides a theoretical explanation for the emergence of such feature imbalance in neural networks.
arXiv Detail & Related papers (2020-11-18T18:52:08Z) - Decontextualized learning for interpretable hierarchical representations
of visual patterns [0.0]
We present an algorithm and training paradigm designed specifically to address this: decontextualized hierarchical representation learning (DHRL)
DHRL address the limitations of small datasets and encourages a disentangled set of hierarchically organized features.
In addition to providing a tractable path for analyzing complex hierarchal patterns using variation inference, this approach is generative and can be directly combined with empirical and theoretical approaches.
arXiv Detail & Related papers (2020-08-31T14:47:55Z) - Local Propagation in Constraint-based Neural Network [77.37829055999238]
We study a constraint-based representation of neural network architectures.
We investigate a simple optimization procedure that is well suited to fulfil the so-called architectural constraints.
arXiv Detail & Related papers (2020-02-18T16:47:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.