Learning the gravitational force law and other analytic functions
- URL: http://arxiv.org/abs/2005.07724v1
- Date: Fri, 15 May 2020 18:11:48 GMT
- Title: Learning the gravitational force law and other analytic functions
- Authors: Atish Agarwala, Abhimanyu Das, Rina Panigrahy, Qiuyi Zhang
- Abstract summary: We show that a wide, one-hidden-layer ReLU network can learn analytic functions with a number of samples proportional to the derivative of a related function.
As an example, we prove explicit bounds on learning the many-body gravitational force function given by Newton's law of gravitation.
We present experimental evidence that the many-body gravitational force function is easier to learn with ReLU networks as compared to networks with exponential activations.
- Score: 12.673062890348312
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large neural network models have been successful in learning functions of
importance in many branches of science, including physics, chemistry and
biology. Recent theoretical work has shown explicit learning bounds for wide
networks and kernel methods on some simple classes of functions, but not on
more complex functions which arise in practice. We extend these techniques to
provide learning bounds for analytic functions on the sphere for any kernel
method or equivalent infinitely-wide network with the corresponding activation
function trained with SGD. We show that a wide, one-hidden-layer ReLU network
can learn analytic functions with a number of samples proportional to the
derivative of a related function. Many functions important in the sciences are
therefore efficiently learnable. As an example, we prove explicit bounds on
learning the many-body gravitational force function given by Newton's law of
gravitation. Our theoretical bounds suggest that very wide ReLU networks (and
the corresponding NTK kernel) are better at learning analytic functions as
compared to kernel learning with Gaussian kernels. We present experimental
evidence that the many-body gravitational force function is easier to learn
with ReLU networks as compared to networks with exponential activations.
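To make the experimental comparison concrete, here is a minimal PyTorch sketch of the kind of experiment the abstract describes: a wide one-hidden-layer network trained with SGD to regress the Newtonian force on one body, once with a ReLU activation and once with an exponential activation. This is not the authors' code; unit masses and G = 1, the distance clamp that avoids near-singular targets, and all widths, learning rates, and step counts are illustrative assumptions.

```python
import torch

torch.manual_seed(0)
n_bodies, dim, width = 4, 3, 2048

def sample_batch(n):
    # Bodies placed on the unit sphere, loosely matching the paper's
    # setting of functions on the sphere; configurations drawn at random.
    x = torch.randn(n, n_bodies, dim)
    x = x / x.norm(dim=-1, keepdim=True)
    # Newton's law with G = 1 and unit masses (an assumption for the demo):
    #   F_0 = sum_{j != 0} (x_j - x_0) / ||x_j - x_0||^3
    diff = x[:, 1:, :] - x[:, :1, :]
    dist = diff.norm(dim=-1, keepdim=True).clamp(min=0.5)  # avoid singularities
    force = (diff / dist**3).sum(dim=1)
    return x.reshape(n, -1), force

class OneHiddenLayer(torch.nn.Module):
    """Wide, one-hidden-layer network with a pluggable activation."""
    def __init__(self, act):
        super().__init__()
        self.fc1 = torch.nn.Linear(n_bodies * dim, width)
        self.fc2 = torch.nn.Linear(width, dim)
        self.act = act

    def forward(self, x):
        return self.fc2(self.act(self.fc1(x)))

X_train, y_train = sample_batch(4096)
X_test, y_test = sample_batch(1024)

activations = {
    "relu": torch.relu,
    # Clamp keeps exp from overflowing; the exponential net may also need a
    # smaller learning rate than ReLU to stay stable.
    "exp": lambda z: torch.exp(z.clamp(max=5.0)),
}
for name, act in activations.items():
    net = OneHiddenLayer(act)
    opt = torch.optim.SGD(net.parameters(), lr=1e-2)
    for _ in range(2000):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(net(X_train), y_train)
        loss.backward()
        opt.step()
    with torch.no_grad():
        test_mse = torch.nn.functional.mse_loss(net(X_test), y_test).item()
    print(f"{name}: test MSE = {test_mse:.4f}")
```

Under the paper's theory, one would expect the ReLU network to reach a lower test error for a fixed sample budget; the exact numbers this sketch produces depend entirely on the assumed hyperparameters.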
Related papers
- Data-driven discovery of Green's functions [0.0]
This thesis introduces theoretical results and deep learning algorithms to learn Green's functions associated with linear partial differential equations.
The construction connects the fields of PDE learning and numerical linear algebra.
Rational neural networks (NNs) are introduced: neural networks with trainable rational activation functions (see the sketch after this entry).
arXiv Detail & Related papers (2022-10-28T09:41:50Z) - Offline Reinforcement Learning with Differentiable Function
- Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient [65.08966446962845]
Offline reinforcement learning, which aims at optimizing decision-making strategies with historical data, has been extensively applied in real-life applications.
We take a step by considering offline reinforcement learning with differentiable function class approximation (DFA).
Most importantly, we show offline differentiable function approximation is provably efficient by analyzing the pessimistic fitted Q-learning algorithm.
arXiv Detail & Related papers (2022-10-03T07:59:42Z) - Benefits of Overparameterized Convolutional Residual Networks: Function
Approximation under Smoothness Constraint [48.25573695787407]
We prove that large ConvResNets can not only approximate a target function in terms of function value, but also exhibit sufficient first-order smoothness.
Our theory partially justifies the benefits of using deep and wide networks in practice.
arXiv Detail & Related papers (2022-06-09T15:35:22Z) - Inducing Gaussian Process Networks [80.40892394020797]
We propose inducing Gaussian process networks (IGN), a simple framework for simultaneously learning the feature space as well as the inducing points.
The inducing points, in particular, are learned directly in the feature space, enabling a seamless representation of complex structured domains.
We report on experimental results for real-world data sets showing that IGNs provide significant advances over state-of-the-art methods.
arXiv Detail & Related papers (2022-04-21T05:27:09Z) - Optimal Approximation with Sparse Neural Networks and Applications [0.0]
We use deep sparsely connected neural networks to measure the complexity of a function class in $L^2(\mathbb{R}^d)$.
We also introduce a representation system: a countable collection of functions used to guide neural networks.
We then analyse the complexity of a class called $\beta$ cartoon-like functions using rate-distortion theory and a wedgelet construction.
arXiv Detail & Related papers (2021-08-14T05:14:13Z) - Towards Lower Bounds on the Depth of ReLU Neural Networks [7.355977594790584]
We investigate whether the class of exactly representable functions strictly increases by adding more layers.
We settle an old conjecture about piecewise linear functions by Wang and Sun (2005) in the affirmative.
We present upper bounds on the sizes of neural networks required to represent functions with logarithmic depth.
arXiv Detail & Related papers (2021-05-31T09:49:14Z) - The Connection Between Approximation, Depth Separation and Learnability
in Neural Networks [70.55686685872008]
We study the connection between learnability and approximation capacity.
We show that learnability with deep networks of a target function depends on the ability of simpler classes to approximate the target.
arXiv Detail & Related papers (2021-01-31T11:32:30Z) - A Use of Even Activation Functions in Neural Networks [0.35172332086962865]
We propose an alternative approach to integrate existing knowledge or hypotheses of data structure by constructing custom activation functions.
We show that using an even activation function in one of the fully connected layers improves neural network performance (see the sketch after this entry).
arXiv Detail & Related papers (2020-11-23T20:33:13Z) - On Function Approximation in Reinforcement Learning: Optimism in the
- On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces [208.67848059021915]
We study the exploration-exploitation tradeoff at the core of reinforcement learning.
In particular, we prove that the complexity of the function class $\mathcal{F}$ characterizes the complexity of the function.
Our regret bounds are independent of the number of episodes.
arXiv Detail & Related papers (2020-11-09T18:32:22Z) - How Neural Networks Extrapolate: From Feedforward to Graph Neural
Networks [80.55378250013496]
We study how neural networks trained by gradient descent extrapolate what they learn outside the support of the training distribution.
Graph Neural Networks (GNNs) have shown some success in more complex tasks.
arXiv Detail & Related papers (2020-09-24T17:48:59Z)