Causal inference through multi-stage learning and doubly robust deep neural networks
- URL: http://arxiv.org/abs/2407.08560v1
- Date: Thu, 11 Jul 2024 14:47:44 GMT
- Title: Causal inference through multi-stage learning and doubly robust deep neural networks
- Authors: Yuqian Zhang, Jelena Bradic
- Abstract summary: Deep neural networks (DNNs) have demonstrated remarkable empirical performance in large-scale supervised learning problems.
This study delves into the application of DNNs across a wide spectrum of intricate causal inference tasks.
- Score: 10.021381302215062
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks (DNNs) have demonstrated remarkable empirical performance in large-scale supervised learning problems, particularly in scenarios where both the sample size $n$ and the dimension of covariates $p$ are large. This study delves into the application of DNNs across a wide spectrum of intricate causal inference tasks, where direct estimation falls short and necessitates multi-stage learning. Examples include estimating the conditional average treatment effect and dynamic treatment effect. In this framework, DNNs are constructed sequentially, with subsequent stages building upon preceding ones. To mitigate the impact of estimation errors from early stages on subsequent ones, we integrate DNNs in a doubly robust manner. In contrast to previous research, our study offers theoretical assurances regarding the effectiveness of DNNs in settings where the dimensionality $p$ expands with the sample size. These findings are significant independently and extend to degenerate single-stage learning problems.
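For intuition about the multi-stage, doubly robust construction described in the abstract, the sketch below shows a cross-fitted AIPW-style estimator of the conditional average treatment effect in which neural networks serve as the first-stage nuisance models and a second-stage network regresses the doubly robust pseudo-outcome on covariates. This is a minimal illustration under stated assumptions, not the authors' implementation; the function name `dr_cate`, the choice of scikit-learn MLPs, and all hyperparameters are illustrative.

```python
# Minimal sketch (assumptions, not the paper's code): cross-fitted doubly
# robust (AIPW) estimation of the CATE with neural-network nuisance models.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.neural_network import MLPRegressor, MLPClassifier

def dr_cate(X, A, Y, n_splits=2, seed=0):
    """Stage 1 fits outcome and propensity models on held-out folds;
    Stage 2 regresses the AIPW pseudo-outcome on covariates to get tau(x)."""
    n = len(Y)
    pseudo = np.zeros(n)
    for train, test in KFold(n_splits, shuffle=True, random_state=seed).split(X):
        # Stage 1: nuisance estimation on the training fold.
        mu1 = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000,
                           random_state=seed).fit(X[train][A[train] == 1],
                                                  Y[train][A[train] == 1])
        mu0 = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000,
                           random_state=seed).fit(X[train][A[train] == 0],
                                                  Y[train][A[train] == 0])
        e = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=2000,
                          random_state=seed).fit(X[train], A[train])
        # AIPW pseudo-outcome on the held-out fold; an error in either
        # nuisance estimate is corrected by the other (double robustness).
        m1, m0 = mu1.predict(X[test]), mu0.predict(X[test])
        p = np.clip(e.predict_proba(X[test])[:, 1], 0.01, 0.99)
        pseudo[test] = (m1 - m0
                        + A[test] * (Y[test] - m1) / p
                        - (1 - A[test]) * (Y[test] - m0) / (1 - p))
    # Stage 2: a second DNN maps covariates to the pseudo-outcome,
    # approximating tau(x) = E[Y(1) - Y(0) | X = x].
    tau = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000,
                       random_state=seed).fit(X, pseudo)
    return tau
```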
Related papers
- Provably Neural Active Learning Succeeds via Prioritizing Perplexing Samples [53.95282502030541]
Neural Network-based active learning (NAL) is a cost-effective data selection technique that utilizes neural networks to select and train on a small subset of samples.
We move one step forward by offering a unified explanation, from a feature learning view, for the success of both query-criteria-based NAL approaches.
arXiv Detail & Related papers (2024-06-06T10:38:01Z) - What Variables Affect Out-of-Distribution Generalization in Pretrained Models? [15.047920317548128]
Embeddings produced by pre-trained deep neural networks (DNNs) are widely used, but their efficacy for downstream tasks can vary widely.
We study the factors influencing transferability and out-of-distribution generalization of pre-trained DNN embeddings.
arXiv Detail & Related papers (2024-05-23T19:43:45Z) - Two-Phase Dynamics of Interactions Explains the Starting Point of a DNN Learning Over-Fitted Features [68.3512123520931]
We investigate the dynamics of a deep neural network (DNN) learning interactions.
In this paper, we discover that the DNN learns interactions in two phases.
The first phase mainly penalizes interactions of medium and high orders, and the second phase mainly learns interactions of gradually increasing orders.
arXiv Detail & Related papers (2024-05-16T17:13:25Z) - Unveiling and Mitigating Generalized Biases of DNNs through the Intrinsic Dimensions of Perceptual Manifolds [46.47992213722412]
Building fair deep neural networks (DNNs) is a crucial step towards achieving trustworthy artificial intelligence.
We propose Intrinsic Dimension Regularization (IDR), which enhances the fairness and performance of models.
In various image recognition benchmark tests, IDR significantly mitigates model bias while improving its performance.
arXiv Detail & Related papers (2024-04-22T04:16:40Z) - Be Persistent: Towards a Unified Solution for Mitigating Shortcuts in Deep Learning [24.200516684111175]
Shortcut learning is ubiquitous among many failure cases of neural networks.
Finding a unified solution for shortcut learning in DNNs is not out of reach, and topological data analysis (TDA) can play a significant role in forming such a framework.
arXiv Detail & Related papers (2024-02-17T10:02:22Z) - Adversarial Machine Learning in Latent Representations of Neural Networks [9.372908891132772]
Distributed deep neural networks (DNNs) have been shown to reduce the computational burden of mobile devices and decrease the end-to-end inference latency in edge computing scenarios.
This paper rigorously analyzes the robustness of distributed DNNs against adversarial action.
arXiv Detail & Related papers (2023-09-29T17:01:29Z) - Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
arXiv Detail & Related papers (2022-10-25T14:45:15Z) - On the Relationship Between Adversarial Robustness and Decision Region in Deep Neural Network [26.656444835709905]
We study the internal properties of Deep Neural Networks (DNNs) that affect model robustness under adversarial attacks.
We propose the novel concept of the Populated Region Set (PRS), where training samples are populated more frequently.
arXiv Detail & Related papers (2022-07-07T16:06:34Z) - On the Intrinsic Structures of Spiking Neural Networks [66.57589494713515]
Recent years have seen a surge of interest in SNNs owing to their remarkable potential to handle time-dependent and event-driven data.
However, there has been a dearth of comprehensive studies examining the impact of intrinsic structures within spiking computations.
This work delves into the intrinsic structures of SNNs, elucidating their influence on the expressivity of SNNs.
arXiv Detail & Related papers (2022-06-21T09:42:30Z) - Individual Treatment Effect Estimation Through Controlled Neural Network Training in Two Stages [0.757024681220677]
We develop a Causal-Deep Neural Network (CDNN) model trained in two stages to infer causal impact estimates at an individual unit level.
We observe that CDNN is highly competitive and often yields the most accurate individual treatment effect estimates.
arXiv Detail & Related papers (2022-01-21T06:34:52Z) - Spatial-Temporal-Fusion BNN: Variational Bayesian Feature Layer [77.78479877473899]
We design a spatial-temporal-fusion BNN for efficiently scaling BNNs to large models.
Compared to vanilla BNNs, our approach can greatly reduce the training time and the number of parameters, which helps scale BNNs efficiently.
arXiv Detail & Related papers (2021-12-12T17:13:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.