Related papers: Extended Fiducial Inference for Individual Treatment Effects via Deep Neural Networks

Extended Fiducial Inference for Individual Treatment Effects via Deep Neural Networks

URL: http://arxiv.org/abs/2505.01995v1
Date: Sun, 04 May 2025 05:40:45 GMT
Title: Extended Fiducial Inference for Individual Treatment Effects via Deep Neural Networks
Authors: Sehwan Kim, Faming Liang,
Abstract summary: This work introduces the Double Neural Network (Double-NN) method to address the problem of individual treatment effect estimation.<n>Deep neural networks are used to model the treatment and control effect functions, while an additional neural network is employed to estimate their parameters.<n> Numerical results highlight the superior performance of the proposed Double-NN method compared to the conformal quantile regression (CQR) method in individual treatment effect estimation.
Score: 7.916654052803723
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Individual treatment effect estimation has gained significant attention in recent data science literature. This work introduces the Double Neural Network (Double-NN) method to address this problem within the framework of extended fiducial inference (EFI). In the proposed method, deep neural networks are used to model the treatment and control effect functions, while an additional neural network is employed to estimate their parameters. The universal approximation capability of deep neural networks ensures the broad applicability of this method. Numerical results highlight the superior performance of the proposed Double-NN method compared to the conformal quantile regression (CQR) method in individual treatment effect estimation. From the perspective of statistical inference, this work advances the theory and methodology for statistical inference of large models. Specifically, it is theoretically proven that the proposed method permits the model size to increase with the sample size $n$ at a rate of $O(n^{\zeta})$ for some $0 \leq \zeta<1$, while still maintaining proper quantification of uncertainty in the model parameters. This result marks a significant improvement compared to the range $0\leq \zeta < \frac{1}{2}$ required by the classical central limit theorem. Furthermore, this work provides a rigorous framework for quantifying the uncertainty of deep neural networks under the neural scaling law, representing a substantial contribution to the statistical understanding of large-scale neural network models.

Related papers

Certified Neural Approximations of Nonlinear Dynamics [52.79163248326912]
In safety-critical contexts, the use of neural approximations requires formal bounds on their closeness to the underlying system.<n>We propose a novel, adaptive, and parallelizable verification method based on certified first-order models.
arXiv Detail & Related papers (2025-05-21T13:22:20Z)
Optimizing Deep Neural Networks using Safety-Guided Self Compression [0.0]
This study introduces a novel safety-driven quantization framework that prunes and quantizes neural network weights.<n>The proposed methodology is rigorously evaluated on both a convolutional neural network (CNN) and an attention-based language model.<n> Experimental results reveal that our framework achieves up to a 2.5% enhancement in test accuracy relative to the original unquantized models.
arXiv Detail & Related papers (2025-05-01T06:50:30Z)
SEF: A Method for Computing Prediction Intervals by Shifting the Error Function in Neural Networks [0.0]
SEF (Shifting the Error Function) method presented in this paper is a new method that belongs to this category of methods. The proposed approach involves training a single neural network three times, thus generating an estimate along with the corresponding upper and lower bounds for a given problem. This innovative process effectively produces PIs, resulting in a robust and efficient technique for uncertainty quantification.
arXiv Detail & Related papers (2024-09-08T19:46:45Z)
Analyzing Neural Network-Based Generative Diffusion Models through Convex Optimization [45.72323731094864]
We present a theoretical framework to analyze two-layer neural network-based diffusion models. We prove that training shallow neural networks for score prediction can be done by solving a single convex program. Our results provide a precise characterization of what neural network-based diffusion models learn in non-asymptotic settings.
arXiv Detail & Related papers (2024-02-03T00:20:25Z)
Tractable Function-Space Variational Inference in Bayesian Neural Networks [72.97620734290139]
A popular approach for estimating the predictive uncertainty of neural networks is to define a prior distribution over the network parameters. We propose a scalable function-space variational inference method that allows incorporating prior information. We show that the proposed method leads to state-of-the-art uncertainty estimation and predictive performance on a range of prediction tasks.
arXiv Detail & Related papers (2023-12-28T18:33:26Z)
Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence. We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers. This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z)
Continuous time recurrent neural networks: overview and application to forecasting blood glucose in the intensive care unit [56.801856519460465]
Continuous time autoregressive recurrent neural networks (CTRNNs) are a deep learning model that account for irregular observations. We demonstrate the application of these models to probabilistic forecasting of blood glucose in a critical care setting.
arXiv Detail & Related papers (2023-04-14T09:39:06Z)
Efficient Bayes Inference in Neural Networks through Adaptive Importance Sampling [19.518237361775533]
In BNNs, a complete posterior distribution of the unknown weight and bias parameters of the network is produced during the training stage. This feature is useful in countless machine learning applications. It is particularly appealing in areas where decision-making has a crucial impact, such as medical healthcare or autonomous driving.
arXiv Detail & Related papers (2022-10-03T14:59:23Z)
ESCHER: Eschewing Importance Sampling in Games by Computing a History Value Function to Estimate Regret [97.73233271730616]
Recent techniques for approximating Nash equilibria in very large games leverage neural networks to learn approximately optimal policies (strategies) DREAM, the only current CFR-based neural method that is model free and therefore scalable to very large games, trains a neural network on an estimated regret target that can have extremely high variance due to an importance sampling term inherited from Monte Carlo CFR (MCCFR) We show that a deep learning version of ESCHER outperforms the prior state of the art -- DREAM and neural fictitious self play (NFSP) -- and the difference becomes dramatic as game size increases.
arXiv Detail & Related papers (2022-06-08T18:43:45Z)
Multi-Sample Online Learning for Spiking Neural Networks based on Generalized Expectation Maximization [42.125394498649015]
Spiking Neural Networks (SNNs) capture some of the efficiency of biological brains by processing through binary neural dynamic activations. This paper proposes to leverage multiple compartments that sample independent spiking signals while sharing synaptic weights. The key idea is to use these signals to obtain more accurate statistical estimates of the log-likelihood training criterion, as well as of its gradient.
arXiv Detail & Related papers (2021-02-05T16:39:42Z)
Multi-fidelity Bayesian Neural Networks: Algorithms and Applications [0.0]
We propose a new class of Bayesian neural networks (BNNs) that can be trained using noisy data of variable fidelity. We apply them to learn function approximations as well as to solve inverse problems based on partial differential equations (PDEs)
arXiv Detail & Related papers (2020-12-19T02:03:53Z)
Exploring the Uncertainty Properties of Neural Networks' Implicit Priors in the Infinite-Width Limit [47.324627920761685]
We use recent theoretical advances that characterize the function-space prior to an ensemble of infinitely-wide NNs as a Gaussian process. This gives us a better understanding of the implicit prior NNs place on function space. We also examine the calibration of previous approaches to classification with the NNGP.
arXiv Detail & Related papers (2020-10-14T18:41:54Z)
Measurement error models: from nonparametric methods to deep neural networks [3.1798318618973362]
We propose an efficient neural network design for estimating measurement error models. We use a fully connected feed-forward neural network to approximate the regression function $f(x)$. We conduct an extensive numerical study to compare the neural network approach with classical nonparametric methods.
arXiv Detail & Related papers (2020-07-15T06:05:37Z)
Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study a distributed variable for large-scale AUC for a neural network as with a deep neural network. Our model requires a much less number of communication rounds and still a number of communication rounds in theory. Our experiments on several datasets show the effectiveness of our theory and also confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)

This list is automatically generated from the titles and abstracts of the papers in this site.