Deep learning with missing data
- URL: http://arxiv.org/abs/2504.15388v2
- Date: Tue, 29 Apr 2025 10:07:53 GMT
- Title: Deep learning with missing data
- Authors: Tianyi Ma, Tengyao Wang, Richard J. Samworth,
- Abstract summary: We propose Pattern Embedded Neural Networks (PENNs), which can be applied in conjunction with any existing imputation technique.<n>In addition to a neural network trained on the imputed data, PENNs pass the vectors of observation indicators through a second neural network to provide a compact representation.<n>The outputs are then combined in a third neural network to produce final predictions.
- Score: 3.829599191332801
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the context of multivariate nonparametric regression with missing covariates, we propose Pattern Embedded Neural Networks (PENNs), which can be applied in conjunction with any existing imputation technique. In addition to a neural network trained on the imputed data, PENNs pass the vectors of observation indicators through a second neural network to provide a compact representation. The outputs are then combined in a third neural network to produce final predictions. Our main theoretical result exploits an assumption that the observation patterns can be partitioned into cells on which the Bayes regression function behaves similarly, and belongs to a compositional H\"older class. It provides a finite-sample excess risk bound that holds for an arbitrary missingness mechanism, and in combination with a complementary minimax lower bound, demonstrates that our PENN estimator attains in typical cases the minimax rate of convergence as if the cells of the partition were known in advance, up to a poly-logarithmic factor in the sample size. Numerical experiments on simulated, semi-synthetic and real data confirm that the PENN estimator consistently improves, often dramatically, on standard neural networks without pattern embedding. Code to reproduce our experiments, as well as a tutorial on how to apply our method, is publicly available.
Related papers
- Uncertainty propagation in feed-forward neural network models [3.987067170467799]
We develop new uncertainty propagation methods for feed-forward neural network architectures.<n>We derive analytical expressions for the probability density function (PDF) of the neural network output.<n>A key finding is that an appropriate linearization of the leaky ReLU activation function yields accurate statistical results.
arXiv Detail & Related papers (2025-03-27T00:16:36Z) - Semi-Supervised Deep Sobolev Regression: Estimation and Variable Selection by ReQU Neural Network [3.4623717820849476]
We propose SDORE, a Semi-supervised Deep Sobolev Regressor, for the nonparametric estimation of the underlying regression function and its gradient.<n>Our study includes a thorough analysis of the convergence rates of SDORE in $L2$-norm, achieving the minimax optimality.
arXiv Detail & Related papers (2024-01-09T13:10:30Z) - Deep Neural Networks Tend To Extrapolate Predictably [51.303814412294514]
neural network predictions tend to be unpredictable and overconfident when faced with out-of-distribution (OOD) inputs.
We observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
We show how one can leverage our insights in practice to enable risk-sensitive decision-making in the presence of OOD inputs.
arXiv Detail & Related papers (2023-10-02T03:25:32Z) - Joint Edge-Model Sparse Learning is Provably Efficient for Graph Neural
Networks [89.28881869440433]
This paper provides the first theoretical characterization of joint edge-model sparse learning for graph neural networks (GNNs)
It proves analytically that both sampling important nodes and pruning neurons with the lowest-magnitude can reduce the sample complexity and improve convergence without compromising the test accuracy.
arXiv Detail & Related papers (2023-02-06T16:54:20Z) - Bagged Polynomial Regression and Neural Networks [0.0]
Series and dataset regression are able to approximate the same function classes as neural networks.
textitbagged regression (BPR) is an attractive alternative to neural networks.
BPR performs as well as neural networks in crop classification using satellite data.
arXiv Detail & Related papers (2022-05-17T19:55:56Z) - On the Neural Tangent Kernel Analysis of Randomly Pruned Neural Networks [91.3755431537592]
We study how random pruning of the weights affects a neural network's neural kernel (NTK)
In particular, this work establishes an equivalence of the NTKs between a fully-connected neural network and its randomly pruned version.
arXiv Detail & Related papers (2022-03-27T15:22:19Z) - Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity
on Pruned Neural Networks [79.74580058178594]
We analyze the performance of training a pruned neural network by analyzing the geometric structure of the objective function.
We show that the convex region near a desirable model with guaranteed generalization enlarges as the neural network model is pruned.
arXiv Detail & Related papers (2021-10-12T01:11:07Z) - Path classification by stochastic linear recurrent neural networks [2.5499055723658097]
We show that RNNs retain a partial signature of the paths they are fed as the unique information exploited for training and classification tasks.
We argue that these RNNs are easy to train and robust and back these observations with numerical experiments on both synthetic and real data.
arXiv Detail & Related papers (2021-08-06T12:59:12Z) - Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss.
We examine how these benign overfitting phenomena occur in a two-layer neural network setting.
We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z) - Multi-Sample Online Learning for Spiking Neural Networks based on
Generalized Expectation Maximization [42.125394498649015]
Spiking Neural Networks (SNNs) capture some of the efficiency of biological brains by processing through binary neural dynamic activations.
This paper proposes to leverage multiple compartments that sample independent spiking signals while sharing synaptic weights.
The key idea is to use these signals to obtain more accurate statistical estimates of the log-likelihood training criterion, as well as of its gradient.
arXiv Detail & Related papers (2021-02-05T16:39:42Z) - Modeling from Features: a Mean-field Framework for Over-parameterized
Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over- parameterized deep neural networks (DNNs)
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z) - Provably Efficient Neural Estimation of Structural Equation Model: An
Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized Structural equation models (SEMs)
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using a gradient descent.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.