Efficient Estimation in NPIV Models: A Comparison of Various Neural
Networks-Based Estimators
- URL: http://arxiv.org/abs/2110.06763v2
- Date: Thu, 14 Oct 2021 15:07:40 GMT
- Title: Efficient Estimation in NPIV Models: A Comparison of Various Neural Networks-Based Estimators
- Authors: Jiafeng Chen, Xiaohong Chen, Elie Tamer
- Abstract summary: We investigate the computational performance of Artificial Neural Networks (ANNs) in semi-nonparametric instrumental variables (NPIV) models.
We focus on efficient estimation of expectation functionals and use ANNs to approximate the unknown functions.
We conduct a large set of Monte Carlo experiments that compare the finite-sample performance of the estimators in complicated designs.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We investigate the computational performance of Artificial Neural Networks
(ANNs) in semi-nonparametric instrumental variables (NPIV) models of high
dimensional covariates that are relevant to empirical work in economics. We
focus on efficient estimation of and inference on expectation functionals (such
as weighted average derivatives) and use optimal criterion-based procedures
(sieve minimum distance or SMD) and novel efficient score-based procedures
(ES). Both these procedures use ANN to approximate the unknown function. Then,
we provide a detailed practitioner's recipe for implementing these two classes
of estimators. This involves choosing tuning parameters not only for the
unknown functions (which include conditional expectations) but also for
estimating the optimal weights in SMD and the Riesz representers used with the
ES estimators. Next, we conduct a large set of Monte Carlo experiments that
compare the finite-sample performance in complicated designs
that involve a large set of regressors (up to 13 continuous), and various
underlying nonlinearities and covariate correlations. Some of the takeaways
from our results include: 1) tuning and optimization are delicate, especially
as the problem is nonconvex; 2) the various ANN architectures do not seem to
matter for the designs we consider, and given proper tuning, ANN methods
perform well; 3) stable inferences are more difficult to achieve with ANN
estimators; 4) optimal SMD-based estimators perform adequately; 5) there seems
to be a gap between implementation and approximation theory. Finally, we apply
the ANN NPIV estimators to estimate average price elasticities and average
derivatives in two demand examples.
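The SMD procedure lends itself to a compact illustration. Below is a minimal sketch of an ANN-based sieve minimum distance estimator on a simulated NPIV design, using identity weighting rather than the optimal weighting the paper studies; the design, the polynomial instrument basis, and all tuning values (network width, learning rate, step count) are illustrative assumptions, not the authors' implementation.

```python
# Minimal SMD sketch for the NPIV model Y = h0(X) + U with E[U | Z] = 0.
# Identity weighting; all tuning choices below are illustrative.
import torch

torch.manual_seed(0)

# Simulate a toy NPIV design: Z instruments an endogenous X.
n = 2000
Z = torch.randn(n, 1)
V = torch.randn(n, 1)                      # first-stage noise
X = 0.8 * Z + 0.6 * V                      # endogenous regressor
U = 0.5 * V + 0.3 * torch.randn(n, 1)      # error correlated with X through V
h0 = lambda x: torch.sin(x) + 0.5 * x      # true structural function
Y = h0(X) + U

# Instrument sieve b(Z): a low-order polynomial basis.
def basis(z, order=4):
    return torch.cat([z ** k for k in range(order + 1)], dim=1)

B = basis(Z)                               # n x (order + 1)

# Small MLP sieve for the unknown structural function h.
h = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)
opt = torch.optim.Adam(h.parameters(), lr=1e-3)

# SMD objective: squared norm of the sample moments E_n[(Y - h(X)) b(Z)].
for step in range(3000):
    moments = (B * (Y - h(X))).mean(dim=0)  # one moment per basis function
    loss = (moments ** 2).sum()             # identity-weighted minimum distance
    opt.zero_grad(); loss.backward(); opt.step()

# Compare the fitted function with the truth on a grid.
xg = torch.linspace(-2, 2, 9).unsqueeze(1)
print(torch.cat([xg, h(xg).detach(), h0(xg)], dim=1))
```

An efficient version would replace the identity weighting with an estimate of the optimal weighting and add a second sieve for the Riesz representer when targeting a functional, which is exactly where the tuning choices discussed above enter.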
Related papers
- LLaDA 1.5: Variance-Reduced Preference Optimization for Large Language Diffusion Models [76.8317443926908]
Masked Diffusion Models (MDMs) present a promising paradigm for language modeling.
The challenge arises from the high variance of the Evidence Lower Bound (ELBO)-based likelihood estimates required for preference optimization.
We propose Variance-Reduced Preference Optimization (VRPO), a framework that formally analyzes the variance of ELBO estimators and derives bounds on both the bias and variance of preference optimization gradients.
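The core variance issue can be seen in a toy example: when a preference objective depends on the difference of two noisy ELBO estimates, sharing the Monte Carlo noise between them cancels common variation. The sketch below is a generic common-random-numbers illustration, not the paper's VRPO implementation; the `elbo` stand-in and all constants are hypothetical.

```python
# Toy illustration: variance reduction by sharing Monte Carlo noise between
# two estimates whose *difference* drives the objective. Purely illustrative.
import numpy as np

rng = np.random.default_rng(0)

def elbo(theta, noise):
    # Hypothetical one-sample ELBO estimate at parameter theta.
    return -(theta ** 2) + (1.0 + 0.1 * theta) * noise

theta_w, theta_l = 1.0, 1.5          # "winning" and "losing" responses
trials = 10_000

# Independent noise for each ELBO: the difference is high-variance.
d_indep = (elbo(theta_w, rng.normal(size=trials))
           - elbo(theta_l, rng.normal(size=trials)))

# Shared noise: most of the noise cancels in the difference.
shared = rng.normal(size=trials)
d_shared = elbo(theta_w, shared) - elbo(theta_l, shared)

print("variance, independent noise:", d_indep.var())
print("variance, shared noise:     ", d_shared.var())
```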
arXiv Detail & Related papers (2025-05-25T16:36:20Z)
- Distributionally Robust Optimization as a Scalable Framework to Characterize Extreme Value Distributions [22.765095010254118]
The goal of this paper is to develop distributionally robust optimization (DRO) estimators, specifically for multidimensional Extreme Value Theory (EVT) statistics.
In order to mitigate over-conservative estimates while enhancing out-of-sample performance, we study DRO estimators informed by semi-parametric max-stable constraints in the space of point processes.
Both approaches are validated using synthetically generated data, recovering prescribed characteristics, and verifying the efficacy of the proposed techniques.
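As a generic illustration of the DRO principle behind such estimators, the sketch below minimizes the worst-case expected loss over a toy finite ambiguity set; the paper's ambiguity set is instead built from semi-parametric max-stable constraints in the space of point processes, which this example does not attempt to reproduce.

```python
# Generic DRO sketch: min over theta of the max expected loss over an
# ambiguity set of candidate distributions (here, a toy finite family).
import numpy as np

rng = np.random.default_rng(1)
candidates = {
    "light tail": rng.normal(0.0, 1.0, 5000),
    "heavy tail": rng.standard_t(3, 5000),
    "shifted":    rng.normal(0.5, 1.0, 5000),
}

def expected_loss(theta, x):
    return np.mean((x - theta) ** 2)       # squared loss under one candidate

thetas = np.linspace(-1.0, 1.0, 201)
worst_case = [max(expected_loss(t, x) for x in candidates.values())
              for t in thetas]
print("DRO estimate:", thetas[int(np.argmin(worst_case))])
```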
arXiv Detail & Related papers (2024-07-31T19:45:27Z)
- BO4IO: A Bayesian optimization approach to inverse optimization with uncertainty quantification [5.031974232392534]
This work addresses data-driven inverse optimization (IO).
The goal is to estimate unknown parameters in an optimization model from observed decisions that can be assumed to be optimal or near-optimal.
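A stylized version of the inverse-optimization loop looks as follows; for brevity it uses a closed-form forward model and a grid search in place of the paper's Bayesian optimization, and every name and constant here is an illustrative assumption.

```python
# Stylized inverse optimization: recover the parameter of a forward
# optimization model from observed near-optimal decisions.
import numpy as np

rng = np.random.default_rng(2)

def forward(theta):
    # Forward model: x*(theta) = argmin_{x in [0, 1]} (x - theta)^2.
    return np.clip(theta, 0.0, 1.0)

theta_true = 0.7
observed = forward(theta_true) + rng.normal(0.0, 0.05, size=50)

def decision_loss(theta):
    # Misfit between observed decisions and the model's optimal decision.
    return np.mean((observed - forward(theta)) ** 2)

grid = np.linspace(0.0, 1.0, 1001)          # grid search stands in for BO
theta_hat = grid[np.argmin([decision_loss(t) for t in grid])]
print("estimated theta:", theta_hat)
```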
arXiv Detail & Related papers (2024-05-28T06:52:17Z)
- A Specialized Semismooth Newton Method for Kernel-Based Optimal Transport [92.96250725599958]
Kernel-based optimal transport (OT) estimators offer an alternative, functional estimation procedure to address OT problems from samples.
We show that our SSN method achieves a global convergence rate of $O(1/\sqrt{k})$ and a local quadratic convergence rate under standard regularity conditions.
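The paper's semismooth Newton solver is too involved to reproduce here; as a point of reference for sample-based OT, the sketch below runs plain Sinkhorn iterations for entropy-regularized OT between two empirical samples, a standard baseline and explicitly not the SSN method.

```python
# Entropic OT between empirical samples via Sinkhorn iterations (baseline).
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(0.0, 1.0, (50, 2))
y = rng.normal(1.0, 1.0, (60, 2))
C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)   # squared-distance cost
a = np.full(50, 1 / 50)                              # uniform source marginal
b = np.full(60, 1 / 60)                              # uniform target marginal

eps = 0.5                                            # regularization strength
K = np.exp(-C / eps)
u = np.ones(50)
for _ in range(500):                                 # Sinkhorn fixed point
    v = b / (K.T @ u)
    u = a / (K @ v)

P = u[:, None] * K * v[None, :]                      # transport plan
print("entropic OT cost:", (P * C).sum())
```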
arXiv Detail & Related papers (2023-10-21T18:48:45Z)
- Equation Discovery with Bayesian Spike-and-Slab Priors and Efficient Kernels [57.46832672991433]
We propose a novel equation discovery method based on Kernel learning and BAyesian Spike-and-Slab priors (KBASS).
We use kernel regression to estimate the target function, which is flexible, expressive, and more robust to data sparsity and noise.
We develop an expectation-propagation expectation-maximization algorithm for efficient posterior inference and function estimation.
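The kernel-regression ingredient can be illustrated in a few lines; the sketch below fits a kernel ridge regression with an RBF kernel as a stand-in for KBASS's function estimate, leaving out the spike-and-slab prior and the EP-EM inference entirely. Bandwidth and ridge values are toy choices.

```python
# Kernel ridge regression with an RBF kernel: a minimal function estimate.
import numpy as np

rng = np.random.default_rng(4)
x = np.sort(rng.uniform(-3.0, 3.0, 40))
y = np.sin(x) + 0.1 * rng.normal(size=40)

def rbf(a, b, ell=0.5):
    # Squared-exponential kernel with length scale ell.
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * ell ** 2))

K = rbf(x, x)
alpha = np.linalg.solve(K + 1e-3 * np.eye(40), y)    # ridge-regularized fit
x_grid = np.linspace(-3.0, 3.0, 7)
print(rbf(x_grid, x) @ alpha)                        # smoothed estimate
```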
arXiv Detail & Related papers (2023-10-09T03:55:09Z)
- Learning Unnormalized Statistical Models via Compositional Optimization [73.30514599338407]
Noise-contrastive estimation (NCE) has been proposed, formulating the objective as the logistic loss between the real data and artificial noise.
In this paper, we study a direct approach for optimizing the negative log-likelihood of unnormalized models.
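For orientation, here is the textbook NCE objective the paper starts from: fit an unnormalized 1-D Gaussian by the logistic loss that discriminates data from known noise, with the log-normalizer treated as a free parameter. This is the baseline, not the paper's compositional optimization scheme, and all constants are illustrative.

```python
# Textbook noise-contrastive estimation for an unnormalized Gaussian model.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(5)
data = rng.normal(1.0, 1.0, 2000)
noise = rng.normal(0.0, 2.0, 2000)                 # known noise distribution

def log_model(x, params):
    mu, log_c = params                             # log_c: learned log-normalizer
    return -0.5 * (x - mu) ** 2 + log_c

def softplus(t):
    return np.logaddexp(0.0, t)

def nce_loss(params):
    # Log-density ratios against the noise distribution.
    g_data = log_model(data, params) - norm.logpdf(data, 0.0, 2.0)
    g_noise = log_model(noise, params) - norm.logpdf(noise, 0.0, 2.0)
    # Logistic loss: classify data as real and noise as fake.
    return softplus(-g_data).mean() + softplus(g_noise).mean()

result = minimize(nce_loss, x0=np.zeros(2))
print("estimated mean and log-normalizer:", result.x)
```

The estimated log-normalizer should approach -log sqrt(2*pi) for this model, which is what makes NCE usable when the normalizing constant is unknown.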
arXiv Detail & Related papers (2023-06-13T01:18:16Z)
- Variational Linearized Laplace Approximation for Bayesian Deep Learning [11.22428369342346]
We propose a new method for approximating the Linearized Laplace Approximation (LLA) using a variational sparse Gaussian Process (GP).
Our method is based on the dual RKHS formulation of GPs and retains, as the predictive mean, the output of the original DNN.
It allows for efficient optimization, which results in sub-linear training time in the size of the training dataset.
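To fix what Laplace-style predictive variances look like, the sketch below runs a bare-bones last-layer Laplace approximation on a toy regression network; this is a common simplification, not the paper's variational sparse-GP treatment of the full linearized Laplace, and the noise and prior variances are assumed values.

```python
# Last-layer Laplace approximation on a toy regression net.
import torch

torch.manual_seed(6)
x = torch.linspace(-3, 3, 100).unsqueeze(1)
y = torch.sin(x) + 0.1 * torch.randn_like(x)

feat = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh())
head = torch.nn.Linear(32, 1)
opt = torch.optim.Adam(list(feat.parameters()) + list(head.parameters()), lr=1e-2)
for _ in range(1000):                        # ordinary MAP-style training
    loss = ((head(feat(x)) - y) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# Gaussian posterior over the last layer: covariance
# (Phi^T Phi / sigma^2 + I / prior_var)^{-1} in feature space Phi.
with torch.no_grad():
    Phi = feat(x)                            # n x 32 feature matrix
    sigma2, prior_var = 0.01, 1.0            # assumed noise and prior variances
    H = Phi.T @ Phi / sigma2 + torch.eye(32) / prior_var
    cov = torch.linalg.inv(H)
    pred_var = (Phi @ cov * Phi).sum(dim=1) + sigma2
print(pred_var[:5])                          # predictive variances
```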
arXiv Detail & Related papers (2023-02-24T10:32:30Z)
- Sparse high-dimensional linear regression with a partitioned empirical Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are made through plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z)
- Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is the linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints.
A second motivation for the bias-constrained estimator (BCE) lies in applications where multiple estimates of the same unknown are averaged for improved performance.
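A schematic of the bias-constraint idea: augment the usual MSE training loss with a squared-bias penalty computed across repeated noisy measurements of the same unknown, so that errors average out when estimates are combined. The architecture, penalty weight, and simulation below are illustrative assumptions, not the paper's settings.

```python
# Bias-constrained training sketch: MSE plus a squared-bias penalty.
import torch

torch.manual_seed(7)
m, k = 500, 8                                # m unknowns, k measurements each
theta = torch.randn(m, 1)                    # true parameters
x = theta.repeat(1, k) + 0.5 * torch.randn(m, k)

net = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.ReLU(),
                          torch.nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)

for _ in range(500):
    est = net(x.reshape(-1, 1)).reshape(m, k)     # estimate per measurement
    err = est - theta
    mse = (err ** 2).mean()
    bias_penalty = (err.mean(dim=1) ** 2).mean()  # squared bias across repeats
    loss = mse + 5.0 * bias_penalty
    opt.zero_grad(); loss.backward(); opt.step()

print("average per-unknown bias:", err.mean(dim=1).abs().mean().item())
```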
arXiv Detail & Related papers (2021-10-24T10:23:51Z)
- The Bias-Variance Tradeoff of Doubly Robust Estimator with Targeted $L_1$ regularized Neural Networks Predictions [0.0]
Doubly robust (DR) estimation of the ATE can be carried out in two steps: in the first step, the treatment and outcome are modeled, and in the second step the predictions are inserted into the DR estimator.
The risk of model misspecification in the first step has led researchers to use machine learning algorithms instead of parametric models.
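The two-step DR recipe is easy to see with the standard AIPW form; the sketch below uses simple logistic and linear nuisance fits rather than the targeted $L_1$-regularized neural networks the paper studies, and the simulated design is an illustrative assumption.

```python
# Standard doubly robust (AIPW) estimate of the ATE with plug-in nuisances.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(8)
n = 5000
x = rng.normal(size=(n, 1))
p = 1 / (1 + np.exp(-x[:, 0]))               # true propensity
a = rng.binomial(1, p)                       # treatment
y = 2.0 * a + x[:, 0] + rng.normal(size=n)   # outcome; true ATE = 2

# Step 1: model the treatment (propensity) and the outcome.
e_hat = LogisticRegression().fit(x, a).predict_proba(x)[:, 1]
mu1 = LinearRegression().fit(x[a == 1], y[a == 1]).predict(x)
mu0 = LinearRegression().fit(x[a == 0], y[a == 0]).predict(x)

# Step 2: insert the predictions into the DR estimator.
dr = (mu1 - mu0
      + a * (y - mu1) / e_hat
      - (1 - a) * (y - mu0) / (1 - e_hat))
print("DR ATE estimate:", dr.mean())
```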
arXiv Detail & Related papers (2021-08-02T15:41:27Z)
- Non-Asymptotic Performance Guarantees for Neural Estimation of $\mathsf{f}$-Divergences [22.496696555768846]
Statistical distances quantify the dissimilarity between probability distributions.
A modern method for estimating such distances from data relies on parametrizing a variational form by a neural network (NN) and optimizing it.
This paper explores the tradeoff between approximation and empirical estimation errors by means of non-asymptotic error bounds, focusing on three popular choices of SDs.
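One concrete instance of such a neural estimator is the Donsker-Varadhan bound for the KL divergence, optimized over a small MLP; the sketch below is a generic example of the estimator class the paper's bounds concern, with network size and step counts as illustrative assumptions.

```python
# Neural KL estimation via the Donsker-Varadhan variational form:
# KL(P || Q) = sup_f E_P[f] - log E_Q[exp(f)].
import torch

torch.manual_seed(9)
P = torch.distributions.Normal(1.0, 1.0)
Q = torch.distributions.Normal(0.0, 1.0)

f = torch.nn.Sequential(torch.nn.Linear(1, 64), torch.nn.ReLU(),
                        torch.nn.Linear(64, 1))
opt = torch.optim.Adam(f.parameters(), lr=1e-3)
n = 512

for _ in range(2000):
    xp, xq = P.sample((n, 1)), Q.sample((n, 1))
    log_mean_exp = (torch.logsumexp(f(xq).squeeze(1), dim=0)
                    - torch.log(torch.tensor(float(n))))
    dv = f(xp).mean() - log_mean_exp         # DV lower bound on KL(P || Q)
    opt.zero_grad(); (-dv).backward(); opt.step()

print("DV estimate:", dv.item(),
      "| true KL:", torch.distributions.kl_divergence(P, Q).item())
```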
arXiv Detail & Related papers (2021-03-11T19:47:30Z)
- Machine learning for causal inference: on the use of cross-fit estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties.
We conducted a simulation study to assess the performance of several estimators for the average causal effect (ACE).
When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage.
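The cross-fitting scheme itself is a small amount of code: fit the nuisance models on one fold, form the estimate on the held-out fold, swap, and average. The skeleton below keeps the within-fold estimator deliberately trivial; the fold count, names, and demo are illustrative.

```python
# Cross-fitting skeleton: nuisances trained and evaluated on disjoint folds.
import numpy as np

def crossfit(data, fit_nuisance, estimate, rng):
    idx = rng.permutation(len(data))
    halves = np.array_split(idx, 2)
    parts = []
    for k in (0, 1):
        nuisance = fit_nuisance(data[halves[1 - k]])       # training fold
        parts.append(estimate(data[halves[k]], nuisance))  # held-out fold
    return float(np.mean(parts))                           # average the folds

rng = np.random.default_rng(10)
data = rng.normal(2.0, 1.0, 1000)
# Trivial demo: the "nuisance" is a fold mean, the estimate a corrected mean.
est = crossfit(data,
               fit_nuisance=lambda d: d.mean(),
               estimate=lambda d, nu: nu + (d - nu).mean(),
               rng=rng)
print("cross-fit estimate:", est)
```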
arXiv Detail & Related papers (2020-04-21T23:09:55Z)