Nonuniform random feature models using derivative information
- URL: http://arxiv.org/abs/2410.02132v1
- Date: Thu, 3 Oct 2024 01:30:13 GMT
- Title: Nonuniform random feature models using derivative information
- Authors: Konstantin Pieper, Zezhong Zhang, Guannan Zhang
- Abstract summary: We propose nonuniform data-driven parameter distributions for neural network initialization based on derivative data of the function to be approximated.
We address the cases of Heaviside and ReLU activation functions, and their smooth approximations (sigmoid and softplus).
We suggest simplifications of these exact densities based on approximate derivative data in the input points that allow for very efficient sampling and lead to performance of random feature models close to optimal networks in several scenarios.
- Score: 10.239175197655266
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose nonuniform data-driven parameter distributions for neural network initialization based on derivative data of the function to be approximated. These parameter distributions are developed in the context of non-parametric regression models based on shallow neural networks, and compare favorably to well-established uniform random feature models based on conventional weight initialization. We address the cases of Heaviside and ReLU activation functions, and their smooth approximations (sigmoid and softplus), and use recent results on the harmonic analysis and sparse representation of neural networks resulting from fully trained optimal networks. Extending analytic results that give exact representation, we obtain densities that concentrate in regions of the parameter space corresponding to neurons that are well suited to model the local derivatives of the unknown function. Based on these results, we suggest simplifications of these exact densities based on approximate derivative data in the input points that allow for very efficient sampling and lead to performance of random feature models close to optimal networks in several scenarios.
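The contrast between uniform and derivative-informed feature sampling described in the abstract can be illustrated with a minimal one-dimensional sketch. It is not the densities derived in the paper: it assumes a simplified heuristic in which ReLU breakpoints are drawn at input points with probability proportional to a finite-difference estimate of the second derivative, and only the linear readout is fitted. All names (`relu_features`, `sample_breaks`, `train_rmse`), the target function, and the feature count are hypothetical choices for illustration.

```python
import numpy as np

def relu_features(x, signs, breaks):
    """Columns: 1, x, and relu(s_k * (x - b_k)) for each random neuron k."""
    hidden = np.maximum(signs[None, :] * (x[:, None] - breaks[None, :]), 0.0)
    return np.hstack([np.ones((x.size, 1)), x[:, None], hidden])

def sample_breaks(x, y, n_features, rng, nonuniform):
    """Draw ReLU breakpoints uniformly, or weighted by estimated curvature."""
    if not nonuniform:
        return rng.uniform(x.min(), x.max(), size=n_features)
    # Assumption: a finite-difference |f''| stands in for the derivative data.
    curvature = np.abs(np.gradient(np.gradient(y, x), x)) + 1e-12
    return rng.choice(x, size=n_features, p=curvature / curvature.sum())

def train_rmse(x, y, n_features, rng, nonuniform):
    """Sample the hidden layer, fit only the linear readout, return training RMSE."""
    breaks = sample_breaks(x, y, n_features, rng, nonuniform)
    signs = rng.choice(np.array([-1.0, 1.0]), size=n_features)
    A = relu_features(x, signs, breaks)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares readout
    return float(np.sqrt(np.mean((A @ coef - y) ** 2)))

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 400)
y = np.tanh(25.0 * x)  # hypothetical target with a sharp transition near 0
rmse_uniform = train_rmse(x, y, 20, rng, nonuniform=False)
rmse_nonuniform = train_rmse(x, y, 20, rng, nonuniform=True)
print(f"uniform: {rmse_uniform:.4f}  curvature-weighted: {rmse_nonuniform:.4f}")
```

Because the hidden parameters stay fixed after sampling, the fit reduces to a linear least-squares problem; the sampling distribution is the only place where derivative information enters in this sketch.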
Related papers
- Analyzing Neural Network-Based Generative Diffusion Models through Convex Optimization [45.72323731094864]
We present a theoretical framework to analyze two-layer neural network-based diffusion models.
We prove that training shallow neural networks for score prediction can be done by solving a single convex program.
Our results provide a precise characterization of what neural network-based diffusion models learn in non-asymptotic settings.
arXiv Detail & Related papers (2024-02-03T00:20:25Z) - Universal approximation property of Banach space-valued random feature models including random neural networks [3.3379026542599934]
We introduce a Banach space-valued extension of random feature learning.
By randomly initializing the feature maps, only the linear readout needs to be trained.
We derive approximation rates and an explicit algorithm to learn an element of the given Banach space.
arXiv Detail & Related papers (2023-12-13T11:27:15Z) - A probabilistic, data-driven closure model for RANS simulations with aleatoric, model uncertainty [1.8416014644193066]
We propose a data-driven closure model for Reynolds-averaged Navier-Stokes (RANS) simulations that incorporates aleatoric model uncertainty.
A fully Bayesian formulation is proposed, combined with a sparsity-inducing prior in order to identify regions in the problem domain where the parametric closure is insufficient.
arXiv Detail & Related papers (2023-07-05T16:53:31Z) - Sparse-Input Neural Network using Group Concave Regularization [10.103025766129006]
Simultaneous feature selection and non-linear function estimation are challenging in neural networks.
We propose a framework of sparse-input neural networks using group concave regularization for feature selection in both low-dimensional and high-dimensional settings.
arXiv Detail & Related papers (2023-07-01T13:47:09Z) - Promises and Pitfalls of the Linearized Laplace in Bayesian Optimization [73.80101701431103]
The linearized-Laplace approximation (LLA) has been shown to be effective and efficient in constructing Bayesian neural networks.
We study the usefulness of the LLA in Bayesian optimization and highlight its strong performance and flexibility.
arXiv Detail & Related papers (2023-04-17T14:23:43Z) - Capturing dynamical correlations using implicit neural representations [85.66456606776552]
We develop an artificial intelligence framework which combines a neural network trained to mimic simulated data from a model Hamiltonian with automatic differentiation to recover unknown parameters from experimental data.
In doing so, we illustrate the ability to build and train a differentiable model only once, which then can be applied in real-time to multi-dimensional scattering data.
arXiv Detail & Related papers (2023-04-08T07:55:36Z) - Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z) - Demystifying Randomly Initialized Networks for Evaluating Generative Models [28.8899914083501]
Evaluation of generative models is mostly based on the comparison between the estimated distribution and the ground truth distribution in a certain feature space.
To embed samples into informative features, previous works often use convolutional neural networks optimized for classification.
In this paper, we rigorously investigate the feature space of models with random weights in comparison to that of trained models.
arXiv Detail & Related papers (2022-08-19T08:43:53Z) - Sampling-free Variational Inference for Neural Networks with Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z) - Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)