Causal Deep Learning: Causal Capsules and Tensor Transformers
- URL: http://arxiv.org/abs/2301.00314v1
- Date: Sun, 1 Jan 2023 00:47:03 GMT
- Title: Causal Deep Learning: Causal Capsules and Tensor Transformers
- Authors: M. Alex O. Vasilescu
- Abstract summary: Inverse causal questions are addressed with a neural network that implements multilinear projection and estimates the causes of effects.
Our forward and inverse neural network architectures are suitable for asynchronous parallel computation.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We derive a set of causal deep neural networks whose architectures are a
consequence of tensor (multilinear) factor analysis. Forward causal questions
are addressed with a neural network architecture composed of causal capsules
and a tensor transformer. The former estimate a set of latent variables that
represent the causal factors, and the latter governs their interaction. Causal
capsules and tensor transformers may be implemented using shallow autoencoders,
but for a scalable architecture we employ block algebra and derive a deep
neural network composed of a hierarchy of autoencoders. An interleaved kernel
hierarchy preprocesses the data resulting in a hierarchy of kernel tensor
factor models. Inverse causal questions are addressed with a neural network
that implements multilinear projection and estimates the causes of effects. As
an alternative to aggressive bottleneck dimension reduction or regularized
regression that may camouflage an inherently underdetermined inverse problem,
we prescribe modeling different aspects of the mechanism of data formation with
piecewise tensor models whose multilinear projections are well-defined and
produce multiple candidate solutions. Our forward and inverse neural network
architectures are suitable for asynchronous parallel computation.
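For a concrete handle on the tensor factor analysis underlying these architectures, the following is a minimal numpy sketch of a Tucker (multilinear) factor model and of a multilinear projection that inverts it. The helper names (`mode_multiply`, `mode_unfold`, `hosvd`) and the closed-form HOSVD are our own simplifications; the paper realizes these computations with causal capsules, a tensor transformer, and hierarchies of autoencoders rather than a direct SVD.

```python
import numpy as np

def mode_multiply(T, M, mode):
    """Mode-n product: contract matrix M (r x n_mode) with tensor T along `mode`."""
    T = np.moveaxis(T, mode, 0)
    return np.moveaxis(np.tensordot(M, T, axes=([1], [0])), 0, mode)

def mode_unfold(T, mode):
    """Mode-n matricization of tensor T."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd(D, ranks):
    """Tucker factorization via higher-order SVD. The mode matrices play the
    role of the causal-factor representations; the core tensor governs their
    interaction (the role the paper assigns to the tensor transformer)."""
    Us = [np.linalg.svd(mode_unfold(D, m), full_matrices=False)[0][:, :r]
          for m, r in enumerate(ranks)]
    core = D
    for m, U in enumerate(Us):
        core = mode_multiply(core, U.T, m)
    return core, Us

rng = np.random.default_rng(0)
D = rng.standard_normal((64, 5, 4))          # measurements x cause A x cause B
core, (U_meas, U_a, U_b) = hosvd(D, ranks=(10, 3, 2))

# Forward question: synthesize the effect of chosen causes (c_a, c_b).
B = mode_unfold(mode_multiply(core, U_meas, 0), 0)   # 64 x (3*2) effect basis
c_a, c_b = rng.standard_normal(3), rng.standard_normal(2)
d = B @ np.kron(c_a, c_b)

# Inverse question: multilinear projection estimates the causes of effect d,
# here via a pseudo-inverse followed by a rank-1 factorization of the
# resulting coefficient matrix.
Z = (np.linalg.pinv(B) @ d).reshape(3, 2)
u, s, vt = np.linalg.svd(Z)
c_a_hat, c_b_hat = u[:, 0] * np.sqrt(s[0]), vt[0] * np.sqrt(s[0])
assert np.allclose(B @ np.kron(c_a_hat, c_b_hat), d)  # recovered up to scale/sign
```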
Related papers
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
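As a rough illustration of the graph encoding this entry describes (not the paper's exact featurization), a feedforward network's parameters can be flattened into nodes (one per neuron, carrying its bias) and directed edges (one per weight):

```python
import numpy as np

def mlp_to_graph(weights, biases):
    """Encode an MLP as a computational graph: node features are biases
    (zero for input neurons), edge features are the individual weights."""
    sizes = [weights[0].shape[0]] + [W.shape[1] for W in weights]
    offsets = np.cumsum([0] + sizes)
    node_feat = np.concatenate([np.zeros(sizes[0]), *biases])
    src, dst, edge_feat = [], [], []
    for l, W in enumerate(weights):            # W has shape (n_in, n_out)
        for i in range(W.shape[0]):
            for j in range(W.shape[1]):
                src.append(offsets[l] + i)
                dst.append(offsets[l + 1] + j)
                edge_feat.append(W[i, j])
    return node_feat, np.array([src, dst]), np.array(edge_feat)

rng = np.random.default_rng(0)
weights = [rng.standard_normal((2, 3)), rng.standard_normal((3, 1))]
biases = [rng.standard_normal(3), rng.standard_normal(1)]
nodes, edge_index, edge_feat = mlp_to_graph(weights, biases)  # 6 nodes, 9 edges
```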
- A Spectral Theory of Neural Prediction and Alignment
We use a recent theoretical framework that relates the generalization error from regression to the spectral properties of the model and the target.
We test a large number of deep neural networks that predict visual cortical activity and show that there are multiple types of geometries that result in low neural prediction error as measured via regression.
arXiv Detail & Related papers (2023-09-22T12:24:06Z)
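A hedged sketch of the kind of spectral analysis this entry refers to: regress synthetic stand-in "neural" responses on model features, and decompose the target along the eigenmodes of the feature covariance, the two quantities such generalization theories are expressed in. The setup and variable names are illustrative, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 200
Phi = rng.standard_normal((n, p))              # model features for n stimuli
y = Phi @ rng.standard_normal(p) / np.sqrt(p)  # stand-in neural responses
y += 0.1 * rng.standard_normal(n)              # measurement noise

# Spectral quantities: eigenvalues of the feature covariance and the
# target's energy along each eigenmode; the theory writes the regression
# (neural prediction) error in terms of these two sequences.
eigvals, eigvecs = np.linalg.eigh(Phi.T @ Phi / n)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]      # descending order
alignment = (eigvecs.T @ (Phi.T @ y / n)) ** 2

# Ridge regression from features to responses.
lam = 1e-2
w = np.linalg.solve(Phi.T @ Phi / n + lam * np.eye(p), Phi.T @ y / n)
train_err = np.mean((Phi @ w - y) ** 2)
```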
- Permutation Equivariant Neural Functionals
This work studies the design of neural networks that can process the weights or gradients of other neural networks.
We focus on the permutation symmetries that arise in the weights of deep feedforward networks because hidden layer neurons have no inherent order.
In our experiments, we find that permutation equivariant neural functionals are effective on a diverse set of tasks.
arXiv Detail & Related papers (2023-02-27T18:52:38Z)
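The weight-space symmetry this entry mentions is easy to demonstrate: relabeling the hidden neurons of an MLP (permuting the rows of the first weight matrix and the columns of the second) leaves the network's function unchanged, which is the invariance these neural functionals are built to respect. A minimal check:

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((16, 4)), rng.standard_normal(16)
W2, b2 = rng.standard_normal((3, 16)), rng.standard_normal(3)

def f(x, W1, b1, W2, b2):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2   # two-layer ReLU MLP

# Hidden neurons have no inherent order: permuting them changes the
# parameters but not the function the network computes.
perm = rng.permutation(16)
x = rng.standard_normal(4)
assert np.allclose(f(x, W1, b1, W2, b2),
                   f(x, W1[perm], b1[perm], W2[:, perm], b2))
```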
- A predictive physics-aware hybrid reduced order model for reacting flows
A new hybrid predictive Reduced Order Model (ROM) is proposed to solve reacting flow problems.
The number of degrees of freedom is reduced from thousands of temporal points to a few POD modes with their corresponding temporal coefficients.
Two different deep learning architectures have been tested to predict the temporal coefficients.
arXiv Detail & Related papers (2023-01-24T08:39:20Z)
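A minimal sketch of the POD reduction step described in this entry, assuming a synthetic snapshot matrix in place of real reacting-flow data; the deep-learning predictor for the temporal coefficients is omitted:

```python
import numpy as np

rng = np.random.default_rng(0)
n_space, n_time = 2000, 400
snapshots = rng.standard_normal((n_space, n_time))  # columns = states in time

# POD via SVD: spatial modes and their temporal coefficients.
U, s, Vt = np.linalg.svd(snapshots, full_matrices=False)
r = 10                                  # keep a few energetic modes
modes = U[:, :r]                        # spatial POD modes
coeffs = np.diag(s[:r]) @ Vt[:r]        # r x n_time temporal coefficients

# The ROM evolves only the r coefficient trajectories (thousands of
# temporal points reduced to r sequences), then reconstructs the field:
reconstruction = modes @ coeffs
```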
- Modeling Structure with Undirected Neural Networks
We propose undirected neural networks, a flexible framework for specifying computations that can be performed in any order.
We demonstrate the effectiveness of undirected neural architectures, both unstructured and structured, on a range of tasks.
arXiv Detail & Related papers (2022-02-08T10:06:51Z)
- A Sparse Coding Interpretation of Neural Networks and Theoretical Implications
Deep convolutional neural networks have achieved unprecedented performance in various computer vision tasks.
We propose a sparse coding interpretation of neural networks that have ReLU activation.
We derive a complete convolutional neural network without normalization and pooling.
arXiv Detail & Related papers (2021-08-14T21:54:47Z)
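The connection this entry gestures at can be seen in one line: a single ISTA step for nonnegative sparse coding, started from zero, is exactly a ReLU layer. This is the standard observation such interpretations build on, shown here as a sketch rather than the paper's full derivation:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 256))   # dictionary; columns are atoms
x = rng.standard_normal(64)          # input signal
lam, eta = 0.1, 0.05                 # sparsity weight, gradient step size

# One ISTA step for  min_a 0.5*||x - W a||^2 + lam*||a||_1,  a >= 0,
# started from a = 0: the nonnegative soft-threshold is exactly a ReLU.
a = np.maximum(eta * (W.T @ x) - eta * lam, 0.0)
```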
- The Causal Neural Connection: Expressiveness, Learnability, and Inference
An object called structural causal model (SCM) represents a collection of mechanisms and sources of random variation of the system under investigation.
In this paper, we show that the causal hierarchy theorem (Thm. 1, Bareinboim et al., 2020) still holds for neural models.
We introduce a special type of SCM called a neural causal model (NCM), and formalize a new type of inductive bias to encode structural constraints necessary for performing causal inferences.
arXiv Detail & Related papers (2021-07-02T01:55:18Z)
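For context on the preceding entry: an SCM in its simplest form is a set of mechanisms driven by exogenous noise, and an NCM keeps this structure while parameterizing the mechanisms with neural networks. A toy numpy example (ours, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n, do_x=None):
    """Draw from a toy SCM X -> Y; `do_x` implements the intervention do(X=x)."""
    u_x, u_y = rng.standard_normal(n), rng.standard_normal(n)  # exogenous noise
    x = u_x if do_x is None else np.full(n, do_x)
    y = 2.0 * x + u_y                                          # mechanism f_Y
    return x, y

x_obs, y_obs = sample(10_000)          # observational distribution
_, y_do = sample(10_000, do_x=1.0)     # interventional distribution
print(y_obs.mean(), y_do.mean())       # E[Y] ~ 0 vs. E[Y | do(X=1)] ~ 2
```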
- Non-asymptotic Excess Risk Bounds for Classification with Deep Convolutional Neural Networks
We consider the problem of binary classification with a class of general deep convolutional neural networks.
We define the prefactors of the risk bounds in terms of the input data dimension and other model parameters.
We show that the classification methods with CNNs can circumvent the curse of dimensionality.
arXiv Detail & Related papers (2021-05-01T15:55:04Z)
- Going beyond p-convolutions to learn grayscale morphological operators
We present two new morphological layers based on the same principle as the p-convolutional layer.
arXiv Detail & Related papers (2021-02-19T17:22:16Z)
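For reference, the classical flat grayscale dilation that p-convolutions and the layers in the preceding entry generalize; this is a sketch of the textbook operator, not of the paper's layers:

```python
import numpy as np

def dilate(f, b):
    """Grayscale dilation (f (+) b)(x) = max_y f(x - y) + b(y), same-size
    output, with -inf padding at the borders."""
    k, pad = len(b), len(b) // 2
    fp = np.pad(f, pad, constant_values=-np.inf)
    return np.array([np.max(fp[i:i + k] + b[::-1]) for i in range(len(f))])

signal = np.array([0., 1., 0., 3., 0.])
flat_se = np.zeros(3)                  # flat structuring element
print(dilate(signal, flat_se))         # local max filter: [1. 1. 3. 3. 3.]
```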
- Multipole Graph Neural Operator for Parametric Partial Differential Equations
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z)