The Effects of Multi-Task Learning on ReLU Neural Network Functions
- URL: http://arxiv.org/abs/2410.21696v2
- Date: Tue, 05 Nov 2024 22:03:21 GMT
- Title: The Effects of Multi-Task Learning on ReLU Neural Network Functions
- Authors: Julia Nakhleh, Joseph Shenouda, Robert D. Nowak
- Abstract summary: We show that neural network learning problems with large numbers of diverse tasks are approximately equivalent to an $\ell^2$ (Hilbert space) problem over a fixed kernel determined by the optimal neurons.
- Score: 17.786058035763254
- License:
- Abstract: This paper studies the properties of solutions to multi-task shallow ReLU neural network learning problems, wherein the network is trained to fit a dataset with minimal sum of squared weights. Remarkably, the solutions learned for each individual task resemble those obtained by solving a kernel method, revealing a novel connection between neural networks and kernel methods. It is known that single-task neural network training problems are equivalent to a minimum-norm interpolation problem in a non-Hilbertian Banach space, and that the solutions of such problems are generally non-unique. In contrast, we prove that the solutions to univariate-input, multi-task neural network interpolation problems are almost always unique, and coincide with the solution to a minimum-norm interpolation problem in a Sobolev (Reproducing Kernel) Hilbert Space. We also demonstrate a similar phenomenon in the multivariate-input case; specifically, we show that neural network learning problems with large numbers of diverse tasks are approximately equivalent to an $\ell^2$ (Hilbert space) minimization problem over a fixed kernel determined by the optimal neurons.
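The setup described in the abstract can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' code: the names `forward`, `weight_penalty`, `rebalance`, and `path_norm` are hypothetical, the input is assumed univariate, and biases are assumed to be left out of the penalty. It shows a shallow ReLU network whose hidden neurons are shared across tasks, and checks a standard consequence of ReLU positive homogeneity: rescaling each neuron leaves the network function unchanged, and the smallest achievable sum of squared weights equals a sum over neurons of the input-weight magnitude times the $\ell^2$ norm of that neuron's output weights across tasks.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def forward(x, w, b, V):
    """Shallow multi-task ReLU network with univariate input.

    x: (n,) inputs, w: (K,) input weights, b: (K,) biases,
    V: (K, T) output weights, one column per task. Returns (n, T).
    """
    hidden = relu(np.outer(x, w) + b)  # (n, K), shared by all tasks
    return hidden @ V


def weight_penalty(w, V):
    """Half the sum of squared weights (assumption: biases unpenalized)."""
    return 0.5 * (np.sum(w ** 2) + np.sum(V ** 2))


def rebalance(w, b, V):
    """Rescale each neuron so that |w_k| equals ||v_k||_2.

    Uses relu(a * z) == a * relu(z) for a > 0, so the function is unchanged.
    Assumes every w_k and every row of V is nonzero.
    """
    v_norm = np.linalg.norm(V, axis=1)
    a = np.sqrt(v_norm / np.abs(w))
    return w * a, b * a, V / a[:, None]


def path_norm(w, V):
    """Sum over neurons of |w_k| * ||v_k||_2 (l2 norm taken across tasks)."""
    return np.sum(np.abs(w) * np.linalg.norm(V, axis=1))


rng = np.random.default_rng(0)
K, T = 5, 3  # hypothetical sizes: 5 neurons, 3 tasks
w = rng.normal(size=K)
b = rng.normal(size=K)
V = rng.normal(size=(K, T))
x = np.linspace(-2.0, 2.0, 9)

w2, b2, V2 = rebalance(w, b, V)
outputs_before = forward(x, w, b, V)
outputs_after = forward(x, w2, b2, V2)
penalty_before = weight_penalty(w, V)
penalty_after = weight_penalty(w2, V2)
```

After rebalancing, `outputs_after` matches `outputs_before` while `penalty_after` is no larger than `penalty_before` and equals `path_norm(w, V)`. The per-neuron $\ell^2$ norm across tasks is what couples the tasks together in this penalty; with a single task it reduces to an absolute value.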
Related papers
- Newton Informed Neural Operator for Computing Multiple Solutions of Nonlinear Partials Differential Equations [3.8916312075738273]
We propose a novel approach called the Newton Informed Neural Operator to tackle nonlinearities.
Our method combines classical Newton methods, addressing well-posed problems, and efficiently learns multiple solutions in a single learning process.
arXiv Detail & Related papers (2024-05-23T01:52:54Z)
- Deep multitask neural networks for solving some stochastic optimal control problems [0.0]
In this paper, we consider a class of optimal control problems and introduce an effective solution employing neural networks.
To train our multitask neural network, we introduce a novel scheme that dynamically balances the learning across tasks.
Through numerical experiments on real-world derivatives pricing problems, we show that our method outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2024-01-23T17:20:48Z)
- Zonotope Domains for Lagrangian Neural Network Verification [102.13346781220383]
We decompose the problem of verifying a deep neural network into the verification of many 2-layer neural networks.
Our technique yields bounds that improve upon both linear programming and Lagrangian-based verification techniques.
arXiv Detail & Related papers (2022-10-14T19:31:39Z)
- Improved Training of Physics-Informed Neural Networks with Model Ensembles [81.38804205212425]
We propose to expand the solution interval gradually to make the PINN converge to the correct solution.
All ensemble members converge to the same solution in the vicinity of observed data.
We show experimentally that the proposed method can improve the accuracy of the found solution.
arXiv Detail & Related papers (2022-04-11T14:05:34Z)
- Physics informed neural networks for continuum micromechanics [68.8204255655161]
Recently, physics informed neural networks have successfully been applied to a broad variety of problems in applied mathematics and engineering.
Due to the global approximation, physics informed neural networks have difficulties in displaying localized effects and strong non-linear solutions by optimization.
It is shown that the domain decomposition approach is able to accurately resolve nonlinear stress, displacement and energy fields in heterogeneous microstructures obtained from real-world $\mu$CT-scans.
arXiv Detail & Related papers (2021-10-14T14:05:19Z)
- The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU-network with standard Gaussian weights and uniformly distributed biases can solve this problem with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z)
- Finite Basis Physics-Informed Neural Networks (FBPINNs): a scalable domain decomposition approach for solving differential equations [20.277873724720987]
We propose a new, scalable approach for solving large problems relating to differential equations called Finite Basis PINNs (FBPINNs).
FBPINNs are inspired by classical finite element methods, where the solution of the differential equation is expressed as the sum of a finite set of basis functions with compact support.
In FBPINNs, neural networks are used to learn these basis functions, which are defined over small, overlapping subdomain problems.
arXiv Detail & Related papers (2021-07-16T13:03:47Z)
- Achieving Small Test Error in Mildly Overparameterized Neural Networks [30.664282759625948]
We show an algorithm which finds one of these points in time.
In addition, we prove that for a fully connected neural net, with an additional assumption on the data distribution, there is a time algorithm.
arXiv Detail & Related papers (2021-04-24T06:47:20Z)
- Conditional physics informed neural networks [85.48030573849712]
We introduce conditional PINNs (physics informed neural networks) for estimating the solution of classes of eigenvalue problems.
We show that a single deep neural network can learn the solution of partial differential equations for an entire class of problems.
arXiv Detail & Related papers (2021-04-06T18:29:14Z)
- Multipole Graph Neural Operator for Parametric Partial Differential Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z)
- Banach Space Representer Theorems for Neural Networks and Ridge Splines [17.12783792226575]
We develop a variational framework to understand the properties of the functions learned by neural networks fit to data.
We derive a representer theorem showing that finite-width, single-hidden layer neural networks are solutions to inverse problems.
arXiv Detail & Related papers (2020-06-10T02:57:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.