A new approach to generalisation error of machine learning algorithms:
Estimates and convergence
- URL: http://arxiv.org/abs/2306.13784v1
- Date: Fri, 23 Jun 2023 20:57:31 GMT
- Title: A new approach to generalisation error of machine learning algorithms:
Estimates and convergence
- Authors: Michail Loulakis, Charalambos G. Makridakis
- Abstract summary: We introduce a new approach to the estimation of the (generalisation) error and to convergence.
Our results include estimates of the error without any structural assumption on the neural networks.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In this work we consider a model problem of deep neural learning, namely the
learning of a given function f when we have access to its point values on a
finite set of points. The deep neural network interpolant is the resulting
approximation of f, which is obtained by a typical machine
learning algorithm involving a given DNN architecture and an optimisation step,
which is assumed to be solved exactly. These are among the simplest regression
algorithms based on neural networks. In this work we introduce a new approach
to the estimation of the (generalisation) error and to convergence. Our results
include (i) estimates of the error without any structural assumption on the
neural networks, under mild regularity assumptions on the learning function
f, and (ii) convergence of the approximations to the target function f by only
requiring that the neural network spaces have appropriate approximation
capability.
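To make the setting concrete, here is a minimal sketch of the model problem (an illustration of the setting only, not the paper's error estimator): a generic fully connected network is fitted to point values of a known f, the exactly-solved optimisation step is approximated by running an optimiser to near-convergence, and the generalisation error is then estimated by Monte Carlo on fresh points.

```python
# Minimal sketch of the model problem: learn f from point values with a DNN
# and estimate the generalisation error empirically. Illustrative only; this
# is not the error estimator developed in the paper.
import torch
import torch.nn as nn

torch.manual_seed(0)
f = lambda x: torch.sin(2 * torch.pi * x)   # target function (known here)

# Training data: point values of f on a finite set of points in [0, 1].
x_train = torch.rand(64, 1)
y_train = f(x_train)

# A generic fully connected DNN; no structural assumptions are imposed.
model = nn.Sequential(nn.Linear(1, 32), nn.Tanh(),
                      nn.Linear(32, 32), nn.Tanh(),
                      nn.Linear(32, 1))

# The paper assumes the optimisation step is solved exactly; in practice we
# approximate that by running an optimiser to (near) convergence.
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(5000):
    opt.zero_grad()
    loss = ((model(x_train) - y_train) ** 2).mean()
    loss.backward()
    opt.step()

# Monte Carlo estimate of the generalisation error ||f - f_DNN||_{L2}.
with torch.no_grad():
    x_test = torch.rand(10000, 1)
    gen_err = torch.sqrt(((model(x_test) - f(x_test)) ** 2).mean())
print(f"empirical generalisation error: {gen_err.item():.4f}")
```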
Related papers
- SEF: A Method for Computing Prediction Intervals by Shifting the Error Function in Neural Networks
The SEF (Shifting the Error Function) method presented in this paper is a new method for computing prediction intervals with neural networks.
The proposed approach involves training a single neural network three times, thus generating an estimate along with the corresponding upper and lower bounds for a given problem.
This process produces prediction intervals (PIs), resulting in a robust and efficient technique for uncertainty quantification.
arXiv Detail & Related papers (2024-09-08T19:46:45Z)
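A minimal sketch of the three-trainings idea from the SEF entry above: one network for the point estimate and two more for the bounds. Pinball (quantile) losses are used here as a stand-in for the bound networks; the paper's actual error-function-shifting procedure differs, so treat this as illustrative only.

```python
# Illustrative three-trainings prediction-interval sketch (not the exact SEF
# procedure): same architecture trained with three different losses.
import torch
import torch.nn as nn

def make_net():
    return nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))

def train(net, x, y, loss_fn, steps=3000):
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(net(x), y).backward()
        opt.step()
    return net

def pinball(tau):
    # quantile (pinball) loss: penalizes over/under-prediction asymmetrically
    def loss(pred, y):
        d = y - pred
        return torch.maximum(tau * d, (tau - 1) * d).mean()
    return loss

torch.manual_seed(0)
x = torch.rand(256, 1)
y = torch.sin(4 * x) + 0.1 * torch.randn_like(x)   # noisy data

mean_net  = train(make_net(), x, y, nn.MSELoss())  # 1st training: estimate
upper_net = train(make_net(), x, y, pinball(0.95)) # 2nd training: upper bound
lower_net = train(make_net(), x, y, pinball(0.05)) # 3rd training: lower bound

x0 = torch.tensor([[0.5]])
print(lower_net(x0).item(), mean_net(x0).item(), upper_net(x0).item())
```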
- SGD method for entropy error function with smoothing l0 regularization for neural networks
The entropy error function has been widely used in neural networks.
We propose a novel entropy error function with smoothing l0 regularization for feed-forward neural networks.
This enables the networks to learn effectively and produce more accurate predictions.
arXiv Detail & Related papers (2024-05-28T19:54:26Z)
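A sketch of the combination the entry above names: an entropy-style error function plus a smoothed l0 penalty, trained with plain SGD. The surrogate w^2/(w^2 + eps) is one common smooth approximation of the l0 "norm"; it is an assumption here, not necessarily the paper's exact formula.

```python
# Entropy-style (BCE) loss plus a smoothed l0 regularizer, trained with SGD.
# The surrogate w^2/(w^2+eps) tends to 1 as |w| grows and to 0 at w = 0, so
# its sum approximates the number of nonzero weights (illustrative choice).
import torch
import torch.nn as nn

def smoothed_l0(model, eps=1e-2):
    total = 0.0
    for p in model.parameters():
        total = total + (p ** 2 / (p ** 2 + eps)).sum()
    return total

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(10, 16), nn.Sigmoid(),
                      nn.Linear(16, 1), nn.Sigmoid())
x = torch.randn(128, 10)
y = torch.randint(0, 2, (128, 1)).float()

bce = nn.BCELoss()                                  # entropy-style error
opt = torch.optim.SGD(model.parameters(), lr=1e-2)  # plain SGD, per the title
lam = 1e-4                                          # regularization strength
for _ in range(1000):
    opt.zero_grad()
    loss = bce(model(x), y) + lam * smoothed_l0(model)
    loss.backward()
    opt.step()
```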
- The limitation of neural nets for approximation and optimization
We are interested in assessing the use of neural networks as surrogate models to approximate and minimize objective functions in optimization problems.
Our study begins by determining the best activation function for approximating the objective functions of popular nonlinear optimization test problems.
arXiv Detail & Related papers (2023-11-21T00:21:15Z)
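A sketch of the surrogate-modeling workflow described in the entry above: fit a network to samples of an objective, then minimize the surrogate by gradient descent on its input. The Rosenbrock function stands in for the "popular nonlinear optimization test problems"; the choice of test function and activation is an assumption.

```python
# Fit a neural surrogate to an objective, then minimize the surrogate.
import torch
import torch.nn as nn

def rosenbrock(x):  # classic test objective, minimum at (1, 1)
    return (1 - x[:, 0]) ** 2 + 100 * (x[:, 1] - x[:, 0] ** 2) ** 2

torch.manual_seed(0)
x_samp = 4 * torch.rand(2000, 2) - 2          # samples in [-2, 2]^2
y_samp = rosenbrock(x_samp).unsqueeze(1)

surrogate = nn.Sequential(nn.Linear(2, 64), nn.Tanh(),
                          nn.Linear(64, 64), nn.Tanh(),
                          nn.Linear(64, 1))
opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)
for _ in range(4000):                          # fit the surrogate
    opt.zero_grad()
    ((surrogate(x_samp) - y_samp) ** 2).mean().backward()
    opt.step()

# Minimize the trained surrogate instead of the true objective.
z = torch.zeros(1, 2, requires_grad=True)
opt_z = torch.optim.Adam([z], lr=1e-2)
for _ in range(2000):
    opt_z.zero_grad()
    surrogate(z).sum().backward()
    opt_z.step()
print("surrogate minimizer:", z.detach().numpy())
```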
- HNS: An Efficient Hermite Neural Solver for Solving Time-Fractional Partial Differential Equations
We present the high-precision Hermite Neural Solver (HNS) for solving time-fractional partial differential equations.
The experimental results show that HNS has significantly improved accuracy and flexibility compared to existing L1-based methods.
arXiv Detail & Related papers (2023-10-07T12:44:47Z)
- Semantic Strengthening of Neuro-Symbolic Learning
Neuro-symbolic approaches typically resort to fuzzy approximations of a probabilistic objective.
We show how to compute this efficiently for tractable circuits.
We test our approach on three tasks: predicting a minimum-cost path in Warcraft, predicting a minimum-cost perfect matching, and solving Sudoku puzzles.
arXiv Detail & Related papers (2023-02-28T00:04:22Z)
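A tiny illustration of the gap the entry above alludes to: the exact probabilistic objective for a logical constraint versus a fuzzy-logic relaxation of it. The "exactly one of n variables" constraint and the Goedel (min/max) t-norm are assumptions chosen for brevity; the paper works with general constraints compiled into tractable circuits.

```python
# Probabilistic objective vs. a fuzzy relaxation for "exactly one is true".
import torch

p = torch.tensor([0.7, 0.2, 0.4])  # predicted marginals for 3 binary variables
n = len(p)

# Exact probability that exactly one variable is true (independent marginals);
# its negative log is the semantic-loss style probabilistic objective.
prob = sum(p[i] * torch.prod(torch.cat([1 - p[:i], 1 - p[i+1:]]))
           for i in range(n))
semantic_loss = -torch.log(prob)

# A Goedel (min/max) fuzzy relaxation of the same constraint: in general a
# different quantity from the true probability.
fuzzy = max(torch.min(torch.cat([p[i:i+1], 1 - p[:i], 1 - p[i+1:]]))
            for i in range(n))
fuzzy_loss = -torch.log(fuzzy)

print(semantic_loss.item(), fuzzy_loss.item())  # the two objectives disagree
```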
- A Recursively Recurrent Neural Network (R2N2) Architecture for Learning Iterative Algorithms
We generalize the Runge-Kutta neural network to a recurrent neural network (R2N2) superstructure for the design of customized iterative algorithms.
We demonstrate that regular training of the weight parameters inside the proposed superstructure on input/output data of various computational problem classes yields similar iterations to Krylov solvers for linear equation systems, Newton-Krylov solvers for nonlinear equation systems, and Runge-Kutta solvers for ordinary differential equations.
arXiv Detail & Related papers (2022-11-22T16:30:33Z)
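A simplified sketch of the superstructure idea in the entry above: a recurrent template whose inner loop evaluates the problem function at shifted points, like Runge-Kutta stages, with the stage couplings and weights as trainable parameters. This explicit-RK-style template is a reduction for illustration, not the full R2N2 architecture.

```python
# A learned explicit Runge-Kutta-style step with trainable tableau (A, b).
import torch
import torch.nn as nn

class LearnedRKStep(nn.Module):
    def __init__(self, stages=4):
        super().__init__()
        self.A = nn.Parameter(torch.zeros(stages, stages))  # stage couplings
        self.b = nn.Parameter(torch.full((stages,), 1.0 / stages))

    def forward(self, func, x, h):
        ks = []
        for i in range(self.A.shape[0]):
            # explicit scheme: stage i only sees earlier stages j < i
            shift = sum(self.A[i, j] * ks[j] for j in range(i)) if i else 0.0
            ks.append(func(x + h * shift))
        return x + h * sum(bi * ki for bi, ki in zip(self.b, ks))

# Usage: one trainable step applied to dx/dt = -x. Training (A, b) on
# input/output data of a problem class would shape the update into a
# customized iterative algorithm.
step = LearnedRKStep()
x0 = torch.tensor([1.0])
x1 = step(lambda x: -x, x0, h=0.1)
print(x1)
```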
- Scalable computation of prediction intervals for neural networks via matrix sketching
Existing algorithms for uncertainty estimation require modifying the model architecture and training procedure.
This work proposes a new algorithm that can be applied to a given trained neural network and produces approximate prediction intervals.
arXiv Detail & Related papers (2022-05-06T13:18:31Z)
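To convey the post-hoc flavor of the entry above (the trained model is left untouched), here is a crude stand-in: treat fixed hidden features of a trained network as a linear model, apply a random Gaussian projection as the "sketch" to keep the matrices small, and form classical linear-regression prediction intervals. This is an assumed simplification, not the paper's algorithm.

```python
# Post-hoc approximate prediction intervals from sketched linear features.
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 500, 128, 32           # samples, feature dim, sketched dim

Phi = rng.normal(size=(n, d))    # stand-in for hidden features of a trained net
w_true = rng.normal(size=d)
y = Phi @ w_true + 0.5 * rng.normal(size=n)

S = rng.normal(size=(d, k)) / np.sqrt(k)   # Gaussian sketch of feature space
Z = Phi @ S                                # sketched features, n x k

lam = 1e-3
G = Z.T @ Z + lam * np.eye(k)              # small k x k Gram matrix
w = np.linalg.solve(G, Z.T @ y)
resid_var = np.mean((Z @ w - y) ** 2)

z_star = Phi[0] @ S                        # sketched features of a query point
pred = z_star @ w
se = np.sqrt(resid_var * (1 + z_star @ np.linalg.solve(G, z_star)))
lo, hi = pred - 1.96 * se, pred + 1.96 * se
print(f"approximate 95% prediction interval: [{lo:.2f}, {hi:.2f}]")
```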
- Multigoal-oriented dual-weighted-residual error estimation using deep neural networks
Deep learning is considered a powerful tool with high flexibility for approximating functions.
Our approach is based on a posteriori error estimation, in which the adjoint problem is solved for error localization.
An efficient and easy-to-implement algorithm is developed to obtain a posteriori error estimates for multiple goal functionals.
arXiv Detail & Related papers (2021-12-21T16:59:44Z)
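The core dual-weighted-residual (DWR) mechanism behind the entry above can be shown on a linear algebraic system, where it is exact: the error in a goal functional J(u) = c^T u of an approximate solution u_H equals the adjoint-weighted residual z^T (b - A u_H). The paper goes further, using neural networks in the estimation and handling several goal functionals at once; this is only the underlying identity.

```python
# DWR identity on a linear system: goal error = adjoint-weighted residual.
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(50, 50)) + 50 * np.eye(50)   # well-conditioned system
b = rng.normal(size=50)
c = rng.normal(size=50)                           # goal functional J(u) = c @ u

u = np.linalg.solve(A, b)                 # exact solution
u_H = u + 1e-2 * rng.normal(size=50)      # stand-in for an approximate solution

z = np.linalg.solve(A.T, c)               # adjoint (dual) problem: A^T z = c
eta = z @ (b - A @ u_H)                   # dual-weighted residual estimate

print("true goal error:", c @ (u - u_H))
print("DWR estimate:   ", eta)            # identical, up to round-off
```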
- Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)
- Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these networks using gradient descent.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
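A toy sketch of the min-max formulation in the entry above: a primal network f plays against an adversarial test-function network g on the conditional moment condition E[Y - f(X) | X] = 0, trained by simultaneous gradient descent-ascent. The specific objective and toy data are assumptions; the paper treats general SEMs with provable convergence guarantees.

```python
# Adversarial (min-max) estimation of a conditional moment restriction:
#   min_f max_g  E[g(X)(Y - f(X))] - 0.5 E[g(X)^2]
# whose inner maximum equals 0.5 E[(E[Y - f(X) | X])^2].
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.rand(512, 1)
y = 2 * x + 0.1 * torch.randn_like(x)     # toy structural relation

f = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))  # player 1
g = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))  # adversary

opt_f = torch.optim.Adam(f.parameters(), lr=1e-3)
opt_g = torch.optim.Adam(g.parameters(), lr=1e-3)
for _ in range(3000):
    obj = (g(x) * (y - f(x))).mean() - 0.5 * (g(x) ** 2).mean()
    opt_f.zero_grad(); opt_g.zero_grad()
    obj.backward()
    opt_f.step()                          # descend in f
    for p in g.parameters():              # ascend in g: flip its gradients
        p.grad = -p.grad
    opt_g.step()
```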
- Measuring Model Complexity of Neural Networks with Curve Activation Functions
We propose the linear approximation neural network (LANN) to approximate a given deep model with curve activation functions.
We experimentally explore the training process of neural networks and detect overfitting.
We find that L1 and L2 regularizations suppress the increase of model complexity.
arXiv Detail & Related papers (2020-06-16T07:38:06Z)
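A simplified single-activation illustration of the LANN idea from the entry above: replace a curve activation (tanh here) by a piecewise linear interpolant, so the number of linear pieces needed for a given accuracy acts as a complexity measure. This is an assumed reduction for illustration, not the full LANN construction.

```python
# Piecewise linear approximation of a curve activation as a complexity proxy.
import numpy as np

def piecewise_linear_tanh(knots):
    xs = np.linspace(-4, 4, knots)          # breakpoints of the approximation
    ys = np.tanh(xs)
    return lambda x: np.interp(x, xs, ys)   # linear interpolation between knots

x = np.linspace(-4, 4, 1001)
for knots in (3, 5, 9, 17):
    approx = piecewise_linear_tanh(knots)
    err = np.max(np.abs(approx(x) - np.tanh(x)))
    # more linear pieces -> tighter approximation -> higher measured complexity
    print(f"{knots - 1:2d} linear pieces, max deviation {err:.4f}")
```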