Related papers: Deep Learning for Individual Heterogeneity: An Automatic Inference Framework

URL: http://arxiv.org/abs/2010.14694v2
Date: Fri, 23 Jul 2021 19:34:50 GMT
Title: Deep Learning for Individual Heterogeneity: An Automatic Inference Framework
Authors: Max H. Farrell and Tengyuan Liang and Sanjog Misra
Abstract summary: We develop methodology for estimation and inference using machine learning to enrich economic models. We show how to design the network architecture to match the structure of the economic model. We obtain inference based on a novel influence function calculation.
Score: 2.6813717321945107
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We develop methodology for estimation and inference using machine learning to enrich economic models. Our framework takes a standard economic model and recasts the parameters as fully flexible nonparametric functions, to capture the rich heterogeneity based on potentially high dimensional or complex observable characteristics. These "parameter functions" retain the interpretability, economic meaning, and discipline of classical parameters. Deep learning is particularly well-suited to structured modeling of heterogeneity in economics. We show how to design the network architecture to match the structure of the economic model, delivering novel methodology that moves deep learning beyond prediction. We prove convergence rates for the estimated parameter functions. These functions are the key inputs into the finite-dimensional parameter of inferential interest. We obtain inference based on a novel influence function calculation that covers any second-stage parameter and any machine-learning-enriched model that uses a smooth per-observation loss function. No additional derivations are required. The score can be taken directly to data, using automatic differentiation if needed. The researcher need only define the original model and define the parameter of interest. A key insight is that we need not write down the influence function in order to evaluate it on the data. Our framework gives new results for a host of contexts, covering such diverse examples as price elasticities, willingness-to-pay, and surplus measures in binary or multinomial choice models, effects of continuous treatment variables, fractional outcome models, count data, heterogeneous production functions, and more. We apply our methodology to a large scale advertising experiment for short-term loans. We show how economically meaningful estimates and inferences can be made that would be unavailable without our results.
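As a rough illustration of the abstract's point that the score of a smooth per-observation loss can be evaluated by automatic differentiation rather than derived by hand, the sketch below differentiates a binary logit loss with respect to the values of two parameter functions at a single observation. Everything in it is an assumption made for illustration and is not taken from the paper: the index form alpha(x) + beta(x) * t, the names `Dual`, `logit_loss`, and `loss_gradient`, and the use of a hand-rolled forward-mode differentiator in place of a real autodiff library. It computes only the loss gradient, not the paper's influence function.

```python
import math


class Dual:
    """Dual number val + der * eps with eps**2 = 0; der carries a derivative."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule on the derivative part.
        return Dual(self.val * other.val,
                    self.val * other.der + self.der * other.val)

    __rmul__ = __mul__


def sigmoid(v):
    """Numerically stable logistic function for a plain float."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def softplus(z):
    """log(1 + exp(z)) for a Dual z; its derivative is sigmoid(z)."""
    v = z.val
    value = max(v, 0.0) + math.log1p(math.exp(-abs(v)))
    return Dual(value, z.der * sigmoid(v))


def logit_loss(y, t, alpha, beta):
    """Negative log-likelihood of a binary logit with index alpha + beta * t.

    alpha and beta are Dual values standing in for the parameter functions
    alpha(x), beta(x) evaluated at this observation's covariates (hypothetical).
    """
    index = alpha + beta * t
    return softplus(index) + (-y) * index


def loss_gradient(y, t, alpha, beta):
    """Gradient of the loss in (alpha, beta), one forward pass per coordinate."""
    d_alpha = logit_loss(y, t, Dual(alpha, 1.0), Dual(beta, 0.0)).der
    d_beta = logit_loss(y, t, Dual(alpha, 0.0), Dual(beta, 1.0)).der
    return d_alpha, d_beta
```

For this particular loss the gradient also has the closed form (p - y) * (1, t) with p = sigmoid(alpha + beta * t), which gives a simple check on the differentiated values. In practice a library with built-in automatic differentiation would replace the `Dual` class.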

Related papers

Generative Flexible Latent Structure Regression (GFLSR) model [0.5586073503694489]
This paper proposes a Generative Flexible Latent Structure Regression (GFLSR) model structure to address this problem. We show that most linear continuous latent variable methods can be represented under the proposed framework. With a model structure, we analyse the convergence of the parameters and the latent variables.
arXiv Detail & Related papers (2025-08-06T12:37:45Z)
Diffusion Factor Models: Generating High-Dimensional Returns with Factor Structure [13.929007993061564]
We propose a diffusion factor model that integrates latent factor structure into generative diffusion processes. By exploiting the low-dimensional factor structure inherent in asset returns, we decompose the score function. We derive rigorous statistical guarantees, establishing nonasymptotic error bounds for score estimation.
arXiv Detail & Related papers (2025-04-09T04:01:35Z)
Generalized Factor Neural Network Model for High-dimensional Regression [50.554377879576066]
We tackle the challenges of modeling high-dimensional data sets with latent low-dimensional structures hidden within complex, non-linear, and noisy relationships. Our approach enables a seamless integration of concepts from non-parametric regression, factor models, and neural networks for high-dimensional regression.
arXiv Detail & Related papers (2025-02-16T23:13:55Z)
The Buffer Mechanism for Multi-Step Information Reasoning in Language Models [52.77133661679439]
Investigating internal reasoning mechanisms of large language models can help us design better model architectures and training strategies. In this study, we constructed a symbolic dataset to investigate the mechanisms by which Transformer models employ a vertical thinking strategy. We proposed a random matrix-based algorithm to enhance the model's reasoning ability, resulting in a 75% reduction in the training time required for the GPT-2 model.
arXiv Detail & Related papers (2024-05-24T07:41:26Z)
Statistical learning for constrained functional parameters in infinite-dimensional models with applications in fair machine learning [4.974815773537217]
We study the general problem of constrained statistical machine learning through a statistical functional lens. We characterize the constrained functional parameter as the minimizer of a penalized risk criterion using a Lagrange multiplier formulation. Our results suggest natural estimators of the constrained parameter that can be constructed by combining estimates of unconstrained parameters.
arXiv Detail & Related papers (2024-04-15T14:59:21Z)
On the Foundations of Shortcut Learning [20.53986437152018]
We study how predictivity and availability interact to shape models' feature use. We find that linear models are relatively unbiased, but introducing a single hidden layer with ReLU or Tanh units yields a bias.
arXiv Detail & Related papers (2023-10-24T22:54:05Z)
Flow Factorized Representation Learning [109.51947536586677]
We introduce a generative model which specifies a distinct set of latent probability paths that define different input transformations. We show that our model achieves higher likelihoods on standard representation learning benchmarks while simultaneously being closer to approximately equivariant models.
arXiv Detail & Related papers (2023-09-22T20:15:37Z)
Choice Models and Permutation Invariance: Demand Estimation in Differentiated Products Markets [5.8429701619765755]
We demonstrate how non-parametric estimators like neural nets can easily approximate choice functions. Our proposed functionals can flexibly capture underlying consumer behavior in a completely data-driven fashion. Our empirical analysis confirms that the estimator generates realistic and comparable own- and cross-price elasticities.
arXiv Detail & Related papers (2023-07-13T23:24:05Z)
On the Joint Interaction of Models, Data, and Features [82.60073661644435]
We introduce a new tool, the interaction tensor, for empirically analyzing the interaction between data and model through features. Based on these observations, we propose a conceptual framework for feature learning. Under this framework, the expected accuracy for a single hypothesis and agreement for a pair of hypotheses can both be derived in closed-form.
arXiv Detail & Related papers (2023-06-07T21:35:26Z)
Latent Variable Representation for Reinforcement Learning [131.03944557979725]
It remains unclear theoretically and empirically how latent variable models may facilitate learning, planning, and exploration to improve the sample efficiency of model-based reinforcement learning. We provide a representation view of the latent variable models for state-action value functions, which allows both a tractable variational learning algorithm and an effective implementation of the optimism/pessimism principle. In particular, we propose a computationally efficient planning algorithm with UCB exploration by incorporating kernel embeddings of latent variable models.
arXiv Detail & Related papers (2022-12-17T00:26:31Z)
Data-Driven Influence Functions for Optimization-Based Causal Inference [105.5385525290466]
We study a constructive algorithm that approximates Gateaux derivatives for statistical functionals by finite differencing. We study the case where probability distributions are not known a priori but need to be estimated from data.
arXiv Detail & Related papers (2022-08-29T16:16:22Z)
HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models. We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation [132.77005365032468]
We propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE). In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity. Experiments on public datasets validate the effectiveness with better validity, sparsity and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z)
Universal approximation property of invertible neural networks [76.95927093274392]
Invertible neural networks (INNs) are neural network architectures with invertibility by design. Thanks to their invertibility and the tractability of Jacobian, INNs have various machine learning applications such as probabilistic modeling, generative modeling, and representation learning.
arXiv Detail & Related papers (2022-04-15T10:45:26Z)
A Free Lunch with Influence Functions? Improving Neural Network Estimates with Concepts from Semiparametric Statistics [41.99023989695363]
We explore the potential for semiparametric theory to be used to improve neural networks and machine learning algorithms. We propose a new neural network method MultiNet, which seeks the flexibility and diversity of an ensemble using a single architecture.
arXiv Detail & Related papers (2022-02-18T09:35:51Z)
Disentangling Identifiable Features from Noisy Data with Structured Nonlinear ICA [4.340954888479091]
We introduce a new general identifiable framework for principled disentanglement referred to as Structured Independent Component Analysis (SNICA). Our contribution is to extend the identifiability theory of deep generative models for a very broad class of structured models. We establish the major result that identifiability for this framework holds even in the presence of noise of unknown distribution.
arXiv Detail & Related papers (2021-06-17T15:56:57Z)
MINIMALIST: Mutual INformatIon Maximization for Amortized Likelihood Inference from Sampled Trajectories [61.3299263929289]
Simulation-based inference enables learning the parameters of a model even when its likelihood cannot be computed in practice. One class of methods uses data simulated with different parameters to infer an amortized estimator for the likelihood-to-evidence ratio. We show that this approach can be formulated in terms of mutual information between model parameters and simulated data.
arXiv Detail & Related papers (2021-06-03T12:59:16Z)
Edge-assisted Democratized Learning Towards Federated Analytics [67.44078999945722]
We show the hierarchical learning structure of the proposed edge-assisted democratized learning mechanism, namely Edge-DemLearn. We also validate Edge-DemLearn as a flexible model training mechanism to build a distributed control and aggregation methodology in regions.
arXiv Detail & Related papers (2020-12-01T11:46:03Z)
Estimating Structural Target Functions using Machine Learning and Influence Functions [103.47897241856603]
We propose a new framework for statistical machine learning of target functions arising as identifiable functionals from statistical models. This framework is problem- and model-agnostic and can be used to estimate a broad variety of target parameters of interest in applied statistics. We put particular focus on so-called coarsening at random/doubly robust problems with partially unobserved information.
arXiv Detail & Related papers (2020-08-14T16:48:29Z)
Structural Regularization [0.0]
We propose a novel method for modeling data by using structural models based on economic theory as regularizers for statistical models. We show that our method can outperform both the (misspecified) structural model and un-structural-regularized statistical models.
arXiv Detail & Related papers (2020-04-27T06:47:07Z)
Semi-Structured Distributional Regression -- Extending Structured Additive Models by Arbitrary Deep Neural Networks and Data Modalities [0.0]
We propose a general framework to combine structured regression models and deep neural networks into a unifying network architecture. We demonstrate the framework's efficacy in numerical experiments and illustrate its special merits in benchmarks and real-world applications.
arXiv Detail & Related papers (2020-02-13T21:01:26Z)

This list is automatically generated from the titles and abstracts of the papers on this site.