Asymptotic Causal Inference
- URL: http://arxiv.org/abs/2109.09653v1
- Date: Mon, 20 Sep 2021 16:16:00 GMT
- Title: Asymptotic Causal Inference
- Authors: Sridhar Mahadevan
- Abstract summary: We investigate causal inference in the regime as the number of variables approaches infinity using an information-theoretic framework.
We define structural entropy of a causal model in terms of its description complexity measured by the logarithmic growth rate.
We generalize a recently popular bipartite experimental design for studying causal inference on large datasets.
- Score: 6.489113969363787
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We investigate causal inference in the asymptotic regime as the number of
variables approaches infinity using an information-theoretic framework. We
define the structural entropy of a causal model in terms of its description
complexity, measured by the logarithmic growth rate, in bits, of the number of all
directed acyclic graphs (DAGs), parameterized by the edge density d. Structural
entropy yields non-intuitive predictions. If we randomly sample a DAG from the
space of all models, in the range d \in (0, 1/8), almost surely the model is a
two-layer DAG! Semantic entropy quantifies the reduction in entropy where edges
are removed by causal intervention. Semantic causal entropy is defined as the
f-divergence between the observational distribution and the interventional
distribution P', where a subset S of edges are intervened on to determine their
causal influence. We compare the decomposability properties of semantic entropy
for different choices of f-divergences, including KL-divergence, squared
Hellinger distance, and total variation distance. We apply our framework to
generalize a recently popular bipartite experimental design for studying causal
inference on large datasets, where interventions are carried out on one set of
variables (e.g., power plants, items in an online store), but outcomes are
measured on a disjoint set of variables (residents near power plants, or
shoppers). We generalize bipartite designs to k-partite designs, and describe
an optimization framework for finding the optimal k-level DAG architecture for
any value of d \in (0, 1/2). As edge density increases, a sequence of phase
transitions occurs over disjoint intervals of d, with deeper DAG architectures
emerging for larger values of d. We also give a quantitative bound on the
number of samples needed to reliably test for average causal influence for a
k-partite design.
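The abstract defines semantic causal entropy as an f-divergence between the observational distribution and the interventional distribution P', and compares KL-divergence, squared Hellinger distance, and total variation distance. A minimal sketch of these three divergences for discrete distributions (the example distributions below are hypothetical, not drawn from the paper):

```python
import numpy as np

def kl_divergence(p, q):
    """KL divergence D(p || q) in bits for discrete distributions.
    Assumes q > 0 wherever p > 0 (absolute continuity)."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

def squared_hellinger(p, q):
    """Squared Hellinger distance H^2(p, q) = 1 - sum_i sqrt(p_i * q_i)."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(1.0 - np.sum(np.sqrt(p * q)))

def total_variation(p, q):
    """Total variation distance: half the L1 distance between p and q."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(0.5 * np.sum(np.abs(p - q)))

# Hypothetical observational distribution P and post-intervention distribution P'
p_obs = [0.5, 0.3, 0.2]
p_int = [0.7, 0.2, 0.1]
print(kl_divergence(p_obs, p_int))
print(squared_hellinger(p_obs, p_int))
print(total_variation(p_obs, p_int))
```

All three are instances of f-divergences, so they vanish exactly when P = P'; they differ in how they weight disagreements, which is what drives the decomposability comparison in the paper.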
Related papers
- Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis [56.442307356162864]
We study the theoretical aspects of score-based discrete diffusion models under the Continuous Time Markov Chain (CTMC) framework.
We introduce a discrete-time sampling algorithm in the general state space $[S]^d$ that utilizes score estimators at predefined time points.
Our convergence analysis employs a Girsanov-based method and establishes key properties of the discrete score function.
arXiv Detail & Related papers (2024-10-03T09:07:13Z) - KPZ scaling from the Krylov space [83.88591755871734]
Recently, a superdiffusion exhibiting the Kardar-Parisi-Zhang scaling in late-time correlators and autocorrelators has been reported.
Inspired by these results, we explore the KPZ scaling in correlation functions using their realization in the Krylov operator basis.
arXiv Detail & Related papers (2024-06-04T20:57:59Z) - Convergence and concentration properties of constant step-size SGD
through Markov chains [0.0]
We consider the optimization of a smooth and strongly convex objective using constant step-size stochastic gradient descent (SGD).
We show that, for unbiased gradient estimates with mildly controlled variance, the iteration converges to an invariant distribution in total variation distance.
All our results are non-asymptotic and their consequences are discussed through a few applications.
arXiv Detail & Related papers (2023-06-20T12:36:28Z) - Combating Mode Collapse in GANs via Manifold Entropy Estimation [70.06639443446545]
Generative Adversarial Networks (GANs) have shown compelling results in various tasks and applications.
We propose a novel training pipeline to address the mode collapse issue of GANs.
arXiv Detail & Related papers (2022-08-25T12:33:31Z) - High-dimensional limit theorems for SGD: Effective dynamics and critical
scaling [6.950316788263433]
We prove limit theorems for the trajectories of summary statistics of stochastic gradient descent (SGD).
We show a critical scaling regime for the step-size, below which the effective ballistic dynamics matches gradient flow for the population loss.
Around the fixed points of this effective dynamics, the corresponding diffusive limits can be quite complex and even degenerate.
arXiv Detail & Related papers (2022-06-08T17:42:18Z) - Optimal 1-Wasserstein Distance for WGANs [2.1174215880331775]
We provide a thorough analysis of Wasserstein GANs (WGANs) in both the finite-sample and asymptotic regimes.
We derive in passing new results on optimal transport theory in the semi-discrete setting.
arXiv Detail & Related papers (2022-01-08T13:04:03Z) - On the Double Descent of Random Features Models Trained with SGD [78.0918823643911]
We study properties of random features (RF) regression in high dimensions optimized by stochastic gradient descent (SGD).
We derive precise non-asymptotic error bounds of RF regression under both constant and adaptive step-size SGD setting.
We observe the double descent phenomenon both theoretically and empirically.
arXiv Detail & Related papers (2021-10-13T17:47:39Z) - Variational Causal Networks: Approximate Bayesian Inference over Causal
Structures [132.74509389517203]
We introduce a parametric variational family modelled by an autoregressive distribution over the space of discrete DAGs.
In experiments, we demonstrate that the proposed variational posterior is able to provide a good approximation of the true posterior.
arXiv Detail & Related papers (2021-06-14T17:52:49Z) - SGD in the Large: Average-case Analysis, Asymptotics, and Stepsize
Criticality [15.640534097470923]
We propose a new framework for analyzing the dynamics of stochastic gradient descent (SGD) when both the number of samples and dimensions are large.
Using this new framework, we show that the dynamics of SGD on a least squares problem with random data become deterministic in the large sample and dimensional limit.
arXiv Detail & Related papers (2021-02-08T18:00:13Z) - Entropic Causal Inference: Identifiability and Finite Sample Results [14.495984877053948]
Entropic causal inference is a framework for inferring the causal direction between two categorical variables from observational data.
We consider the minimum entropy coupling-based algorithmic approach presented by Kocaoglu et al.
arXiv Detail & Related papers (2021-01-10T08:37:54Z) - Causal Expectation-Maximisation [70.45873402967297]
We show that causal inference is NP-hard even in models characterised by polytree-shaped graphs.
We introduce the causal EM algorithm to reconstruct the uncertainty about the latent variables from data about categorical manifest variables.
We argue that there appears to be an unnoticed limitation to the trending idea that counterfactual bounds can often be computed without knowledge of the structural equations.
arXiv Detail & Related papers (2020-11-04T10:25:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.