Learning Single-Index Models with Shallow Neural Networks
- URL: http://arxiv.org/abs/2210.15651v1
- Date: Thu, 27 Oct 2022 17:52:58 GMT
- Title: Learning Single-Index Models with Shallow Neural Networks
- Authors: Alberto Bietti, Joan Bruna, Clayton Sanford, Min Jae Song
- Abstract summary: We introduce a natural class of shallow neural networks and study its ability to learn single-index models via gradient flow.
We show that the corresponding optimization landscape is benign, which in turn leads to generalization guarantees that match the near-optimal sample complexity of dedicated semi-parametric methods.
- Score: 43.6480804626033
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Single-index models are a class of functions given by an unknown univariate "link" function applied to an unknown one-dimensional projection of the input. These models are particularly relevant in high dimension, when the data might present low-dimensional structure that learning algorithms should adapt to. While several statistical aspects of this model, such as the sample complexity of recovering the relevant (one-dimensional) subspace, are well understood, they rely on tailored algorithms that exploit the specific structure of the target function. In this work, we introduce a natural class of shallow neural networks and study its ability to learn single-index models via gradient flow. More precisely, we consider shallow networks in which the biases of the neurons are frozen at random initialization. We show that the corresponding optimization landscape is benign, which in turn leads to generalization guarantees that match the near-optimal sample complexity of dedicated semi-parametric methods.
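As a rough illustration of this setup (a sketch, not the authors' exact construction), the following trains a shallow ReLU network whose biases are frozen at their random initialization on a single-index target f*(x) = g(<w*, x>); the link g, width, and step size below are illustrative assumptions, and gradient descent stands in for gradient flow.

```python
import torch

torch.manual_seed(0)
d, m, n = 20, 512, 4096           # input dim, width, sample size (illustrative)

# Single-index target f*(x) = g(<w*, x>) with an example cubic link (assumed).
w_star = torch.randn(d)
w_star /= w_star.norm()
g = lambda z: z**3 - 3 * z        # Hermite He_3, just one possible link
X = torch.randn(n, d)
y = g(X @ w_star)

# Shallow ReLU network; the biases b are frozen at random initialization.
W = torch.randn(m, d, requires_grad=True)
b = torch.randn(m)                            # no requires_grad: stays fixed
a = (torch.randn(m) / m).requires_grad_()

# Gradient descent as a discrete stand-in for gradient flow.
opt = torch.optim.SGD([W, a], lr=1e-2)
for step in range(2000):
    pred = torch.relu(X @ W.t() + b) @ a
    loss = ((pred - y) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```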
Related papers
- Robust Feature Learning for Multi-Index Models in High Dimensions [17.183648775698167]
We take the first steps towards understanding adversarially robust feature learning with neural networks.
We show that adversarially robust learning is just as easy as standard learning.
arXiv Detail & Related papers (2024-10-21T19:20:34Z)
- Nonuniform random feature models using derivative information [10.239175197655266]
We propose nonuniform data-driven parameter distributions for neural network initialization based on derivative data of the function to be approximated.
We address the cases of Heaviside and ReLU activation functions, and their smooth approximations (sigmoid and softplus).
We suggest simplifications of these exact densities, based on approximate derivative data at the input points, that allow for very efficient sampling and bring the performance of random feature models close to that of optimal networks in several scenarios.
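To make the pipeline concrete, here is a minimal random feature model in which hidden parameters are sampled once and only the outer layer is fit; the Gaussian draw below is the uniform baseline, and the comment marks where a derivative-informed density would replace it. The sizes and target are assumptions, not the paper's experiments.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, n = 5, 200, 1000
X = rng.standard_normal((n, d))
y = np.sin(X @ rng.standard_normal(d))       # example target function

# Baseline sampling of feature directions and biases. A nonuniform,
# derivative-informed scheme would replace these draws with samples from
# a density concentrated where the target's derivatives are large.
W = rng.standard_normal((m, d))
b = rng.uniform(-2.0, 2.0, size=m)

Phi = np.maximum(X @ W.T + b, 0.0)           # ReLU random features
# Only the outer weights are trained, via ridge-regularized least squares.
coef = np.linalg.solve(Phi.T @ Phi + 1e-3 * np.eye(m), Phi.T @ y)
y_hat = Phi @ coef
```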
arXiv Detail & Related papers (2024-10-03T01:30:13Z)
- On Learning Gaussian Multi-index Models with Gradient Flow [57.170617397894404]
We study gradient flow on the multi-index regression problem for high-dimensional Gaussian data.
We consider a two-timescale algorithm, whereby the low-dimensional link function is learnt with a non-parametric model infinitely faster than the subspace parametrizing the low-rank projection.
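A minimal caricature of the two-timescale idea, in the single-index special case and with a polynomial fit standing in for the non-parametric link estimator (both simplifying assumptions): the link is refit to convergence at every step, while the projection direction moves by only one small gradient step.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 10, 2000
X = rng.standard_normal((n, d))
w_star = np.eye(d)[0]
y = np.tanh(X @ w_star)                      # example target with unknown link

w = rng.standard_normal(d)
w /= np.linalg.norm(w)
for it in range(200):
    z = X @ w
    # Fast timescale: refit the link non-parametrically (here, a degree-5
    # polynomial in z, fit to convergence by least squares).
    V = np.vander(z, 6)
    c = np.linalg.lstsq(V, y, rcond=None)[0]
    g = V @ c
    gprime = np.vander(z, 5) @ (c[:-1] * np.arange(5, 0, -1))
    # Slow timescale: one small gradient step on w for the squared loss.
    grad = -2 * ((y - g) * gprime) @ X / n
    w -= 0.1 * grad
    w /= np.linalg.norm(w)                   # keep w on the unit sphere
```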
arXiv Detail & Related papers (2023-10-30T17:55:28Z)
- Generative Neural Fields by Mixtures of Neural Implicit Functions [43.27461391283186]
We propose a novel approach to learning the generative neural fields represented by linear combinations of implicit basis networks.
Our algorithm learns basis networks in the form of implicit neural representations and their coefficients in a latent space by either conducting meta-learning or adopting auto-decoding paradigms.
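The core representational idea can be sketched in a few lines: a field is a coefficient-weighted sum of shared implicit basis networks. The architecture sizes and the 2-D coordinate domain below are assumptions.

```python
import torch
import torch.nn as nn

class BasisField(nn.Module):
    """One implicit basis network: maps 2-D coordinates to a value."""
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))
    def forward(self, coords):               # coords: (N, 2)
        return self.net(coords)              # -> (N, 1)

class MixtureField(nn.Module):
    """A field represented as a linear combination of shared basis networks."""
    def __init__(self, num_basis=8):
        super().__init__()
        self.bases = nn.ModuleList(BasisField() for _ in range(num_basis))
    def forward(self, coords, coeffs):       # coeffs: (num_basis,)
        outs = torch.stack([bnet(coords) for bnet in self.bases], dim=-1)
        return (outs * coeffs).sum(dim=-1)   # weight and sum the bases

field = MixtureField()
values = field(torch.rand(100, 2), torch.randn(8))   # -> (100, 1)
```

In an auto-decoding setup, coeffs would be a learnable latent per training instance, optimized jointly with the shared bases; a meta-learning variant would instead adapt them in an inner loop.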
arXiv Detail & Related papers (2023-10-30T11:41:41Z)
- Sparse-Input Neural Network using Group Concave Regularization [10.103025766129006]
Simultaneous feature selection and non-linear function estimation are challenging in neural networks.
We propose a framework of sparse-input neural networks using group concave regularization for feature selection in both low-dimensional and high-dimensional settings.
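A sketch of the general recipe under illustrative choices: group each input feature's outgoing first-layer weights, and penalize the group norms with a concave penalty. MCP is used here as one example of a concave penalty; the hyperparameters are assumptions.

```python
import torch
import torch.nn as nn

def mcp(t, lam=0.1, gamma=3.0):
    """Minimax concave penalty (MCP), one example of a concave group penalty."""
    return torch.where(t <= gamma * lam,
                       lam * t - t ** 2 / (2 * gamma),
                       torch.full_like(t, 0.5 * gamma * lam ** 2))

net = nn.Sequential(nn.Linear(50, 32), nn.ReLU(), nn.Linear(32, 1))
X, y = torch.randn(128, 50), torch.randn(128, 1)

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(500):
    # Group j = all first-layer weights leaving input feature j (one column).
    group_norms = net[0].weight.norm(dim=0)          # shape (50,)
    loss = ((net(X) - y) ** 2).mean() + mcp(group_norms).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
# Features whose group norm is driven to ~0 are effectively deselected.
```

In practice such penalties are usually optimized with proximal or thresholding updates rather than the plain gradient steps used above for brevity.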
arXiv Detail & Related papers (2023-07-01T13:47:09Z)
- Rank-R FNN: A Tensor-Based Learning Model for High-Order Data Classification [69.26747803963907]
The Rank-R Feedforward Neural Network (FNN) is a tensor-based nonlinear learning model that imposes Canonical/Polyadic decomposition on its parameters.
It handles inputs as multilinear arrays, bypassing the need for vectorization, and can thus fully exploit the structural information along every data dimension.
We establish the universal approximation and learnability properties of Rank-R FNN, and we validate its performance on real-world hyperspectral datasets.
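For intuition, a rank-R layer for matrix-valued inputs might look as follows (a sketch with illustrative sizes): the weight tensor is constrained to a rank-R CP form, so the layer computes sum_r a_r^T X b_r without ever vectorizing X.

```python
import torch
import torch.nn as nn

class RankRLayer(nn.Module):
    """Linear functional of a matrix input with CP rank-R constrained weights:
       f(X) = sum_r a_r^T X b_r + bias, so X is never vectorized."""
    def __init__(self, rows, cols, rank):
        super().__init__()
        self.A = nn.Parameter(torch.randn(rank, rows) * 0.1)
        self.B = nn.Parameter(torch.randn(rank, cols) * 0.1)
        self.bias = nn.Parameter(torch.zeros(1))
    def forward(self, X):                     # X: (batch, rows, cols)
        # einsum computes a_r^T X b_r for each rank component r, then sums.
        return torch.einsum('ri,bij,rj->b', self.A, X, self.B) + self.bias

layer = RankRLayer(rows=8, cols=8, rank=3)
out = layer(torch.randn(4, 8, 8))             # -> shape (4,)
```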
arXiv Detail & Related papers (2021-04-11T16:37:32Z)
- Deep Magnification-Flexible Upsampling over 3D Point Clouds [103.09504572409449]
We propose a novel end-to-end learning-based framework to generate dense point clouds.
We first formulate the problem explicitly, which boils down to determining the weights and high-order approximation errors.
Then, we design a lightweight neural network to adaptively learn unified and sorted weights as well as the high-order refinements.
arXiv Detail & Related papers (2020-11-25T14:00:18Z)
- Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these networks using gradient descent.
For the first time, we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
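A toy version of the descent-ascent mechanics: the objective below is a simple Lagrangian whose inner maximization recovers squared loss, a stand-in assumption rather than the paper's operator equation.

```python
import torch
import torch.nn as nn

f = nn.Sequential(nn.Linear(3, 16), nn.Tanh(), nn.Linear(16, 1))  # min player
u = nn.Sequential(nn.Linear(3, 16), nn.Tanh(), nn.Linear(16, 1))  # max player

opt_f = torch.optim.Adam(f.parameters(), lr=1e-3)
opt_u = torch.optim.Adam(u.parameters(), lr=1e-3)
X, y = torch.randn(256, 3), torch.randn(256, 1)

for step in range(1000):
    residual = f(X) - y
    w = u(X)
    # min_f max_u E[u(x) * (f(x) - y) - u(x)^2 / 2]; maximizing over u
    # recovers the squared loss, so the game reformulates regression.
    obj = (w * residual - 0.5 * w ** 2).mean()
    opt_f.zero_grad()
    opt_u.zero_grad()
    obj.backward()
    opt_f.step()                  # descent step for the min player
    for p in u.parameters():      # flip gradients: ascent for the max player
        p.grad = -p.grad
    opt_u.step()
```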
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
- Measuring Model Complexity of Neural Networks with Curve Activation Functions [100.98319505253797]
We propose the linear approximation neural network (LANN) to approximate a given deep model with curve activation functions.
We experimentally explore the training process of neural networks and detect overfitting.
We find that $L_1$ and $L_2$ regularization suppress the increase of model complexity.
arXiv Detail & Related papers (2020-06-16T07:38:06Z)
- Deep Randomized Neural Networks [12.333836441649343]
Randomized Neural Networks explore the behavior of neural systems where the majority of connections are fixed.
This chapter surveys all the major aspects regarding the design and analysis of Randomized Neural Networks.
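The simplest instance of this design is a network whose hidden weights are drawn once and left untrained, with only a linear readout fit to data; a minimal deep variant (sizes and activation are assumptions) looks like this:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 8))
y = np.cos(X[:, 0])                               # example target

# Two hidden layers whose weights are drawn once and never trained.
H = X
for width in (256, 256):
    W = rng.standard_normal((H.shape[1], width)) / np.sqrt(H.shape[1])
    H = np.tanh(H @ W)

# Only the linear readout is learned, here by closed-form ridge regression.
beta = np.linalg.solve(H.T @ H + 1e-2 * np.eye(H.shape[1]), H.T @ y)
y_hat = H @ beta
```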
arXiv Detail & Related papers (2020-02-27T17:57:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.