The gap between theory and practice in function approximation with deep
neural networks
- URL: http://arxiv.org/abs/2001.07523v3
- Date: Mon, 15 Feb 2021 23:42:03 GMT
- Title: The gap between theory and practice in function approximation with deep
neural networks
- Authors: Ben Adcock and Nick Dexter
- Abstract summary: Deep learning (DL) is transforming industry as decision-making processes are being automated by deep neural networks (DNNs) trained on real-world data.
We introduce a computational framework for examining DNNs in practice, and use it to study empirical performance with regard to these issues.
- Score: 2.969705152497174
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning (DL) is transforming industry as decision-making processes are
being automated by deep neural networks (DNNs) trained on real-world data.
Driven partly by rapidly-expanding literature on DNN approximation theory
showing they can approximate a rich variety of functions, such tools are
increasingly being considered for problems in scientific computing. Yet, unlike
traditional algorithms in this field, little is known about DNNs from the
principles of numerical analysis, e.g., stability, accuracy, computational
efficiency and sample complexity. In this paper we introduce a computational
framework for examining DNNs in practice, and use it to study empirical
performance with regard to these issues. We study the performance of DNNs of
different widths and depths on test functions in various dimensions, including
smooth and piecewise smooth functions. We also compare DL against best-in-class
methods for smooth function approximation based on compressed sensing (CS). Our main
conclusion from these experiments is that there is a crucial gap between the
approximation theory of DNNs and their practical performance, with trained DNNs
performing relatively poorly on functions for which there are strong
approximation results (e.g. smooth functions), yet performing well in
comparison to best-in-class methods for other functions. To analyze this gap
further, we provide some theoretical insights. We establish a practical
existence theorem, asserting existence of a DNN architecture and training
procedure that offers the same performance as CS. This establishes a key
theoretical benchmark, showing the gap can be closed, albeit via a strategy
guaranteed to perform as well as, but no better than, current best-in-class
schemes. Nevertheless, it demonstrates the promise of practical DNN approximation by
highlighting potential for better schemes through careful design of DNN
architectures and training strategies.
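As a rough illustration of the kind of experiment described above, the following is a minimal sketch, not the authors' actual computational framework: fully connected DNNs of a few widths and depths are trained on pointwise samples of a smooth test function and evaluated by a relative L2 error on held-out points. The test function, architectures, sample sizes, and training hyperparameters are all illustrative assumptions.

```python
# Minimal sketch: train fully connected DNNs of varying width/depth on pointwise
# samples of a smooth test function and report a relative L2 test error.
# All choices below (test function, sizes, hyperparameters) are illustrative assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)
d = 4  # input dimension

def f(x):
    # Smooth test function (illustrative): f(x) = exp(-|x|^2)
    return torch.exp(-x.pow(2).sum(dim=1, keepdim=True))

def make_dnn(width, depth):
    layers, in_dim = [], d
    for _ in range(depth):
        layers += [nn.Linear(in_dim, width), nn.ReLU()]
        in_dim = width
    layers.append(nn.Linear(in_dim, 1))
    return nn.Sequential(*layers)

# Pointwise samples drawn uniformly from [-1, 1]^d
X_train = 2 * torch.rand(1000, d) - 1
y_train = f(X_train)
X_test = 2 * torch.rand(2000, d) - 1
y_test = f(X_test)

for width, depth in [(10, 2), (50, 4), (100, 8)]:
    net = make_dnn(width, depth)
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    for _ in range(2000):  # full-batch training, illustrative iteration count
        opt.zero_grad()
        loss = nn.functional.mse_loss(net(X_train), y_train)
        loss.backward()
        opt.step()
    with torch.no_grad():
        rel_err = ((net(X_test) - y_test).norm() / y_test.norm()).item()
    print(f"width={width:4d}  depth={depth:2d}  relative L2 test error = {rel_err:.2e}")
```

A comparison against a CS-based polynomial baseline, as in the paper, would slot in alongside the loop over (width, depth) pairs.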
Related papers
- Learning smooth functions in high dimensions: from sparse polynomials to deep neural networks [0.9749638953163389]
Learning approximations to smooth target functions of many variables from finite sets of pointwise samples is an important task in scientific computing.
Significant advances have been made in the last decade towards efficient methods for doing this, as well as in the relevant approximation theory and analysis of these techniques.
arXiv Detail & Related papers (2024-04-04T19:07:21Z)
- Efficient kernel surrogates for neural network-based regression [0.8030359871216615]
We study the performance of the Conjugate Kernel (CK), an efficient approximation to the Neural Tangent Kernel (NTK).
We show that the performance of the CK is only marginally worse than that of the NTK and, in certain cases, superior.
In addition to providing a theoretical grounding for using CKs instead of NTKs, our framework suggests a recipe for improving DNN accuracy inexpensively.
arXiv Detail & Related papers (2023-10-28T06:41:47Z)
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been demonstrated to be effective in solving forward and inverse differential equation problems.
However, PINNs can become trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs, improving the stability of the training process (see the sketch after this list).
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
- Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network (NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics, defined by minimizing the population loss, that are more suitable for active learning than the metric used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z)
- Comparative Analysis of Interval Reachability for Robust Implicit and Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs), a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
arXiv Detail & Related papers (2022-04-01T03:31:27Z)
- Leveraging The Topological Consistencies of Learning in Deep Neural Networks [0.0]
We define a new class of topological features that accurately characterize the progress of learning while being quick to compute at run time.
Our proposed topological features are readily equipped for backpropagation, meaning that they can be incorporated in end-to-end training.
arXiv Detail & Related papers (2021-11-30T18:34:48Z)
- Reinforcement Learning with External Knowledge by using Logical Neural Networks [67.46162586940905]
A recent neuro-symbolic framework called Logical Neural Networks (LNNs) can simultaneously provide key properties of both neural networks and symbolic logic.
We propose an integrated method that enables model-free reinforcement learning from external knowledge sources.
arXiv Detail & Related papers (2021-03-03T12:34:59Z)
- Advantage of Deep Neural Networks for Estimating Functions with Singularity on Hypersurfaces [23.21591478556582]
This study considers estimation for a class of non-smooth functions that have singularities on hypersurfaces.
We develop a minimax rate analysis to describe why deep neural networks (DNNs) perform better than other standard methods on such functions.
arXiv Detail & Related papers (2020-11-04T12:51:14Z)
- Optimization and Generalization Analysis of Transduction through Gradient Boosting and Application to Multi-scale Graph Neural Networks [60.22494363676747]
It is known that current graph neural networks (GNNs) are difficult to make deep due to the problem known as over-smoothing.
Multi-scale GNNs are a promising approach for mitigating the over-smoothing problem.
We derive the optimization and generalization guarantees of transductive learning algorithms that include multi-scale GNNs.
arXiv Detail & Related papers (2020-06-15T17:06:17Z)
- Self-Directed Online Machine Learning for Topology Optimization [58.920693413667216]
Self-directed Online Learning Optimization integrates a Deep Neural Network (DNN) with Finite Element Method (FEM) calculations.
Our algorithm was tested on four types of problems including compliance minimization, fluid-structure optimization, heat transfer enhancement and truss optimization.
It reduced the computational time by 2 to 5 orders of magnitude compared with using such methods directly, and outperformed all state-of-the-art algorithms tested in our experiments.
arXiv Detail & Related papers (2020-02-04T20:00:28Z)
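The implicit gradient step mentioned in the ISGD entry above can be read as the update theta_{k+1} = theta_k - eta * grad L(theta_{k+1}), which for a convex loss coincides with the proximal-point step argmin over theta of L(theta) + ||theta - theta_k||^2 / (2*eta). Below is a minimal sketch of that idea on a toy quadratic loss; the loss, step size, and fixed-point inner solver are illustrative assumptions, not the ISGD algorithm of that paper.

```python
# Minimal sketch of an implicit (proximal) gradient step on a toy quadratic loss.
# Assumptions: the loss, step size, and fixed-point inner solver are illustrative;
# this is not the ISGD algorithm from the PINN paper above, only the underlying idea.
import numpy as np

A = np.array([[3.0, 0.5],
              [0.5, 1.0]])   # symmetric positive definite
b = np.array([1.0, -2.0])

def grad_loss(theta):
    # Gradient of L(theta) = 0.5 * theta^T A theta - b^T theta
    return A @ theta - b

def implicit_step(theta_k, eta, inner_iters=100):
    # Solve theta = theta_k - eta * grad_loss(theta) by fixed-point iteration.
    # This equals the proximal-point update argmin_theta L(theta) + ||theta - theta_k||^2 / (2*eta),
    # in contrast to the explicit step theta_k - eta * grad_loss(theta_k).
    theta = theta_k.copy()
    for _ in range(inner_iters):
        theta = theta_k - eta * grad_loss(theta)
    return theta

theta = np.zeros(2)
for _ in range(20):
    theta = implicit_step(theta, eta=0.2)  # eta small enough for the inner iteration to converge
print("iterate after 20 implicit steps:", theta)
print("exact minimizer:               ", np.linalg.solve(A, b))
```

The appeal of the implicit form in training contexts is that it remains stable over a wider range of step sizes than the explicit update theta_k - eta * grad L(theta_k).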
This list is automatically generated from the titles and abstracts of the papers in this site.