Higher-order Quasi-Monte Carlo Training of Deep Neural Networks
- URL: http://arxiv.org/abs/2009.02713v1
- Date: Sun, 6 Sep 2020 11:31:42 GMT
- Title: Higher-order Quasi-Monte Carlo Training of Deep Neural Networks
- Authors: M. Longo, S. Mishra, T. K. Rusch, Ch. Schwab
- Abstract summary: We present a novel algorithmic approach and an error analysis leveraging Quasi-Monte Carlo points for training deep neural network (DNN) surrogates of Data-to-Observable (DtO) maps in engineering design.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a novel algorithmic approach and an error analysis leveraging
Quasi-Monte Carlo points for training deep neural network (DNN) surrogates of
Data-to-Observable (DtO) maps in engineering design. Our analysis reveals
higher-order consistent, deterministic choices of training points in the input
data space for deep and shallow Neural Networks with holomorphic activation
functions such as tanh. These novel training points are proved to facilitate
higher-order decay (in terms of the number of training samples) of the
underlying generalization error, with consistency error bounds that are free
from the curse of dimensionality in the input data space, provided that DNN
weights in hidden layers satisfy certain summability conditions. We present
numerical experiments for DtO maps from elliptic and parabolic PDEs with
uncertain inputs that confirm the theoretical analysis.
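The core algorithmic idea of the abstract, replacing i.i.d. random training samples with deterministic Quasi-Monte Carlo points when fitting a smooth-activation network surrogate, can be illustrated with a minimal sketch. The sketch below is not the authors' code: it uses an ordinary Sobol' sequence (via scipy.stats.qmc) as a stand-in for the higher-order QMC point sets analyzed in the paper, and the data-to-observable map dto_map, network size, and training hyperparameters are illustrative assumptions.

```python
# Minimal sketch (not the paper's implementation): train a one-hidden-layer
# tanh surrogate of a smooth data-to-observable map on deterministic QMC
# training points. A Sobol' sequence stands in for the higher-order digital
# nets analyzed in the paper; dto_map and all hyperparameters are assumptions.
import numpy as np
from scipy.stats import qmc

rng = np.random.default_rng(0)
dim, n_train = 4, 256            # input dimension, number of training points

def dto_map(y):
    # placeholder smooth (holomorphic) data-to-observable map on [0, 1]^dim
    return np.sin(y @ np.arange(1, dim + 1) / dim)

# deterministic QMC training points instead of i.i.d. uniform samples
sobol = qmc.Sobol(d=dim, scramble=False)
Y = sobol.random(n_train)                    # (n_train, dim) in [0, 1]^dim
targets = dto_map(Y)

# one-hidden-layer tanh network trained by plain gradient descent
width = 32
W1 = rng.normal(0, 1 / np.sqrt(dim), (dim, width))
b1 = np.zeros(width)
w2 = rng.normal(0, 1 / np.sqrt(width), width)
b2 = 0.0

lr = 0.1
for _ in range(2000):
    h = np.tanh(Y @ W1 + b1)                 # hidden activations
    pred = h @ w2 + b2
    err = pred - targets                     # residual on the QMC points
    # gradients of the mean-squared training loss
    grad_w2 = h.T @ err / n_train
    grad_b2 = err.mean()
    dh = np.outer(err, w2) * (1 - h ** 2)
    grad_W1 = Y.T @ dh / n_train
    grad_b1 = dh.mean(axis=0)
    w2 -= lr * grad_w2
    b2 -= lr * grad_b2
    W1 -= lr * grad_W1
    b1 -= lr * grad_b1

# crude generalization check on a random test set
Y_test = rng.random((4096, dim))
test_err = np.mean((np.tanh(Y_test @ W1 + b1) @ w2 + b2 - dto_map(Y_test)) ** 2)
print(f"test MSE with {n_train} Sobol' training points: {test_err:.3e}")
```

In the setting of the paper, the interest in such deterministic point sets is that, for sufficiently smooth (holomorphic) maps and suitably summable network weights, the generalization error can decay faster than the Monte Carlo rate N^{-1/2} in the number of training samples N, with bounds free from the curse of dimensionality; swapping sobol.random for rng.random in the sketch gives the i.i.d. baseline to compare against.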
Related papers
- Positional Encoder Graph Quantile Neural Networks for Geographic Data [4.277516034244117]
We introduce the Positional Encoder Graph Quantile Neural Network (PE-GQNN), a novel method that integrates PE-GNNs, Quantile Neural Networks, and recalibration techniques in a fully nonparametric framework.
Experiments on benchmark datasets demonstrate that PE-GQNN significantly outperforms existing state-of-the-art methods in both predictive accuracy and uncertainty quantification.
arXiv Detail & Related papers (2024-09-27T16:02:12Z)
- Wide Neural Networks as Gaussian Processes: Lessons from Deep Equilibrium Models [16.07760622196666]
We study the deep equilibrium model (DEQ), an infinite-depth neural network with shared weight matrices across layers.
Our analysis reveals that as the width of DEQ layers approaches infinity, it converges to a Gaussian process.
Remarkably, this convergence holds even when the limits of depth and width are interchanged.
arXiv Detail & Related papers (2023-10-16T19:00:43Z) - Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z)
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been shown to be effective in solving forward and inverse differential equation problems.
However, PINNs become trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs in order to improve the stability of the training process.
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
- Gradient Descent in Neural Networks as Sequential Learning in RKBS [63.011641517977644]
We construct an exact power-series representation of the neural network in a finite neighborhood of the initial weights.
We prove that, regardless of width, the training sequence produced by gradient descent can be exactly replicated by regularized sequential learning.
arXiv Detail & Related papers (2023-02-01T03:18:07Z)
- Dense Hebbian neural networks: a replica symmetric picture of supervised learning [4.133728123207142]
We consider dense associative neural networks trained with supervision by a teacher.
We investigate their computational capabilities analytically, via the statistical mechanics of spin glasses, and numerically, via Monte Carlo simulations.
arXiv Detail & Related papers (2022-11-25T13:37:47Z)
- Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
arXiv Detail & Related papers (2022-10-25T14:45:15Z)
- Variational Inference for Infinitely Deep Neural Networks [0.4061135251278187]
We introduce the unbounded depth neural network (UDN), an infinitely deep probabilistic model that adapts its complexity to the training data.
We study the UDN on real and synthetic data.
arXiv Detail & Related papers (2022-09-21T03:54:34Z)
- Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks [83.58049517083138]
We consider a two-layer ReLU network trained via gradient descent.
We show that SGD is biased towards a simple solution.
We also provide empirical evidence that knots at locations distinct from the data points might occur.
arXiv Detail & Related papers (2021-11-03T15:14:20Z)
- On the Turnpike to Design of Deep Neural Nets: Explicit Depth Bounds [0.0]
This paper attempts a quantifiable answer to the question of how many layers should be considered in a Deep Neural Network (DNN).
The underlying assumption is that the number of neurons per layer -- i.e., the width of the DNN -- is kept constant.
We prove explicit bounds on the required depths of DNNs based on reachability assumptions and a dissipativity-inducing choice of the regularization terms in the training problem.
arXiv Detail & Related papers (2021-01-08T13:23:37Z)
- Rectified Linear Postsynaptic Potential Function for Backpropagation in Deep Spiking Neural Networks [55.0627904986664]
Spiking Neural Networks (SNNs) use temporal spike patterns to represent and transmit information, which is not only biologically realistic but also suitable for ultra-low-power, event-driven neuromorphic implementation.
This paper investigates the contribution of spike timing dynamics to information encoding, synaptic plasticity, and decision making, providing a new perspective on the design of future deep SNNs and neuromorphic hardware systems.
arXiv Detail & Related papers (2020-03-26T11:13:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.