On the Tractability of Neural Causal Inference
- URL: http://arxiv.org/abs/2110.12052v1
- Date: Fri, 22 Oct 2021 20:38:01 GMT
- Title: On the Tractability of Neural Causal Inference
- Authors: Matej Zečević and Devendra Singh Dhami and Kristian Kersting
- Abstract summary: Sum-product networks (SPNs) offer linear time complexity.
Neural causal models (NCMs) have recently gained traction, demanding a tighter integration of causality into machine learning.
We prove that SPN-based causal inference is generally tractable, as opposed to standard MLP-based NCMs.
- Score: 19.417231973682366
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Roth (1996) proved that any form of marginal inference with probabilistic graphical models (e.g., Bayesian networks) is at least NP-hard. Introduced and extensively investigated over the past decade, the neural probabilistic circuits known as sum-product networks (SPNs) offer linear time complexity. On another note, research around neural causal models (NCMs) has recently gained traction, demanding a tighter integration of causality into machine learning. To this end, we present a theoretical investigation of whether, when, how, and at what cost tractability occurs for different NCMs. We prove that SPN-based causal inference is generally tractable, as opposed to standard MLP-based NCMs. We further introduce a new tractable NCM class that is efficient in inference and fully expressive in terms of Pearl's Causal Hierarchy. Our comparative empirical illustration on simulations and standard benchmarks validates our theoretical proofs.
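To make the tractability contrast concrete, here is a minimal, illustrative sketch (not the authors' implementation; the toy structure, weights, and variable names are assumptions) of how marginal inference in a sum-product network amounts to a single bottom-up pass, i.e., time linear in the number of edges:

```python
# Toy sum-product network (SPN) over two binary variables A and B.
# Any marginal query is answered by one bottom-up evaluation of the circuit.

class Leaf:
    """Indicator leaf for a variable taking a specific value."""
    def __init__(self, var, value):
        self.var, self.value = var, value

    def eval(self, evidence):
        # A variable absent from the evidence is marginalized out: leaf returns 1.
        if self.var not in evidence:
            return 1.0
        return 1.0 if evidence[self.var] == self.value else 0.0

class Product:
    """Product node over children with disjoint scopes (decomposability)."""
    def __init__(self, children):
        self.children = children

    def eval(self, evidence):
        out = 1.0
        for c in self.children:
            out *= c.eval(evidence)
        return out

class Sum:
    """Weighted sum node over children with identical scope (completeness)."""
    def __init__(self, weighted_children):
        self.weighted_children = weighted_children  # list of (weight, node)

    def eval(self, evidence):
        return sum(w * c.eval(evidence) for w, c in self.weighted_children)

# Assumed toy structure encoding a joint distribution over A and B.
x_a1, x_a0 = Leaf("A", 1), Leaf("A", 0)
x_b1, x_b0 = Leaf("B", 1), Leaf("B", 0)
root = Sum([
    (0.6, Product([x_a1, x_b1])),
    (0.4, Product([x_a0, Sum([(0.3, x_b1), (0.7, x_b0)])])),
])

print(root.eval({"A": 1, "B": 1}))  # joint P(A=1, B=1) = 0.6
print(root.eval({"A": 0}))          # marginal P(A=0) = 0.4 (B summed out)
print(root.eval({}))                # normalization check: evaluates to 1.0
```

Marginalizing a variable only requires setting its indicator leaves to 1, so each query costs one pass over the circuit rather than the exponential summation that makes marginal inference in general Bayesian networks NP-hard.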
Related papers
- Consistency of Neural Causal Partial Identification [17.503562318576414]
We show consistency of partial identification via Neural Causal Models (NCMs) in a general setting with both continuous and categorical variables.
Results highlight the impact of the design of the underlying neural network architecture in terms of depth and connectivity.
We provide a counterexample showing that without Lipschitz regularization the NCM may not be consistent.
arXiv Detail & Related papers (2024-05-24T16:12:39Z) - Neural Networks Asymptotic Behaviours for the Resolution of Inverse Problems [0.0]
This paper presents a study of the effectiveness of Neural Network (NN) techniques for deconvolution inverse problems.
We consider NNs limits, corresponding to Gaussian Processes (GPs), where non-linearities in the parameters of the NN can be neglected.
We address the deconvolution inverse problem in the case of a quantum harmonic oscillator simulated through Monte Carlo techniques on a lattice.
arXiv Detail & Related papers (2024-02-14T17:42:24Z) - Towards Demystifying the Generalization Behaviors When Neural Collapse Emerges [132.62934175555145]
Neural Collapse (NC) is a well-known phenomenon of deep neural networks in the terminal phase of training (TPT).
We propose a theoretical explanation for why continued training can still improve accuracy on the test set even after the train accuracy has reached 100%.
We refer to this newly discovered property as "non-conservative generalization".
arXiv Detail & Related papers (2023-10-12T14:29:02Z) - Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z) - Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
arXiv Detail & Related papers (2022-10-25T14:45:15Z) - Extrapolation and Spectral Bias of Neural Nets with Hadamard Product: a Polynomial Net Study [55.12108376616355]
The study of NTK has been devoted to typical neural network architectures, but is incomplete for neural networks with Hadamard products (NNs-Hp).
In this work, we derive the finite-width NTK formulation for a special class of NNs-Hp, i.e., polynomial neural networks.
We prove their equivalence to the kernel regression predictor with the associated NTK, which expands the application scope of NTK.
arXiv Detail & Related papers (2022-09-16T06:36:06Z) - Uniform Generalization Bounds for Overparameterized Neural Networks [5.945320097465419]
We prove uniform generalization bounds for overparameterized neural networks in kernel regimes.
Our bounds capture the exact error rates depending on the differentiability of the activation functions.
We show the equivalence between the RKHS corresponding to the NT kernel and its counterpart corresponding to the Matérn family of kernels.
arXiv Detail & Related papers (2021-09-13T16:20:13Z) - Relating Graph Neural Networks to Structural Causal Models [17.276657786213015]
Causality can be described in terms of a structural causal model (SCM) that carries information on the variables of interest and their mechanistic relations.
We present a theoretical analysis that establishes a novel connection between GNN and SCM.
We then establish a new model class for GNN-based causal inference that is necessary and sufficient for causal effect identification.
arXiv Detail & Related papers (2021-09-09T11:16:31Z) - The Causal Neural Connection: Expressiveness, Learnability, and Inference [125.57815987218756]
An object called structural causal model (SCM) represents a collection of mechanisms and sources of random variation of the system under investigation.
In this paper, we show that the causal hierarchy theorem (Thm. 1, Bareinboim et al., 2020) still holds for neural models.
We introduce a special type of SCM called a neural causal model (NCM), and formalize a new type of inductive bias to encode structural constraints necessary for performing causal inferences (a toy NCM sketch follows after this list).
arXiv Detail & Related papers (2021-07-02T01:55:18Z) - Regularizing Recurrent Neural Networks via Sequence Mixup [7.036759195546171]
We extend a class of celebrated regularization techniques originally proposed for feed-forward neural networks.
Our proposed methods are easy to implement and add little computational complexity, while leveraging the performance of simple neural architectures.
arXiv Detail & Related papers (2020-11-27T05:43:40Z) - A Chain Graph Interpretation of Real-World Neural Networks [58.78692706974121]
We propose an alternative interpretation that identifies NNs as chain graphs (CGs) and feed-forward as an approximate inference procedure.
The CG interpretation specifies the nature of each NN component within the rich theoretical framework of probabilistic graphical models.
We demonstrate with concrete examples that the CG interpretation can provide novel theoretical support and insights for various NN techniques.
arXiv Detail & Related papers (2020-06-30T14:46:08Z)
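As a companion to the SPN sketch above, and following the NCM definition in "The Causal Neural Connection" entry, below is a minimal, hypothetical sketch of an MLP-based NCM on an assumed toy graph X -> Y: each structural equation is a small neural network of its parents and an exogenous noise term, and interventional quantities are estimated by sampling. The graph, architectures, and noise distributions are illustrative assumptions, not a reference implementation.

```python
# Toy MLP-based neural causal model (NCM) for the assumed graph X -> Y.
# Each mechanism is a small MLP of its parents plus exogenous noise.
import torch
import torch.nn as nn

class NCM(nn.Module):
    def __init__(self, hidden=16):
        super().__init__()
        # f_X(U_X): mechanism for the root variable X
        self.f_x = nn.Sequential(nn.Linear(1, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1), nn.Sigmoid())
        # f_Y(X, U_Y): mechanism for Y, taking its parent X and noise U_Y
        self.f_y = nn.Sequential(nn.Linear(2, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1), nn.Sigmoid())

    def forward(self, n, do_x=None):
        # Sample exogenous noise (uniform here, an arbitrary modelling choice).
        u_x = torch.rand(n, 1)
        u_y = torch.rand(n, 1)
        # Observational X, or an intervention do(X = do_x) that overrides f_X.
        x = self.f_x(u_x) if do_x is None else torch.full((n, 1), float(do_x))
        y = self.f_y(torch.cat([x, u_y], dim=1))
        return x, y

ncm = NCM()
x_obs, y_obs = ncm(1000)            # samples from the observational distribution
x_int, y_int = ncm(1000, do_x=1.0)  # samples under do(X = 1)
# Monte Carlo estimate of E[Y | do(X = 1)].
print(y_int.mean().item())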
This list is automatically generated from the titles and abstracts of the papers in this site.