Taylor Learning
- URL: http://arxiv.org/abs/2305.14606v1
- Date: Wed, 24 May 2023 01:10:58 GMT
- Title: Taylor Learning
- Authors: James Schmidt
- Abstract summary: Empirical risk minimization stands behind most optimization in supervised machine learning.
We introduce a learning algorithm to construct models for real analytic functions using neither gradient descent nor empirical risk minimization.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Empirical risk minimization stands behind most optimization in supervised
machine learning. Under this scheme, labeled data is used to approximate an
expected cost (risk), and a learning algorithm updates model-defining
parameters in search of an empirical risk minimizer, with the aim of thereby
approximately minimizing expected cost. Parameter update is often done by some
sort of gradient descent. In this paper, we introduce a learning algorithm to
construct models for real analytic functions using neither gradient descent nor
empirical risk minimization. Observing that such functions are defined by local
information, we situate familiar Taylor approximation methods in the context of
sampling data from a distribution, and prove a nonuniform learning result.
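As a rough illustration of the local, interpolation-based flavor of the approach, the sketch below builds a model around an expansion point by interpolating the nearest sampled points through a Vandermonde system, so the resulting coefficients play the role of estimated Taylor coefficients; no gradient descent or empirical risk minimization is involved. This is a minimal one-dimensional sketch under assumed choices (the name taylor_model, the nearest-sample selection rule, the degree, and the toy target are all illustrative), not the paper's algorithm.

```python
import numpy as np

def taylor_model(xs, ys, x0, degree):
    """Build a local polynomial model around x0 from sampled data.

    Interpolates the degree+1 samples nearest to x0 by solving a
    Vandermonde system (no gradient descent, no empirical risk
    minimization); the coefficients approximate the Taylor
    coefficients of the target function at x0.
    """
    idx = np.argsort(np.abs(xs - x0))[: degree + 1]
    V = np.vander(xs[idx] - x0, degree + 1, increasing=True)
    coeffs = np.linalg.solve(V, ys[idx])           # c_k ~ f^(k)(x0) / k!
    return lambda x: np.polyval(coeffs[::-1], x - x0)

# Toy usage: learn exp(x) near 0 from randomly sampled points.
rng = np.random.default_rng(0)
xs = rng.uniform(-1.0, 1.0, size=200)
ys = np.exp(xs)
model = taylor_model(xs, ys, x0=0.0, degree=4)
print(model(0.1), np.exp(0.1))                     # close near the expansion point
```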
Related papers
- Symmetric Q-learning: Reducing Skewness of Bellman Error in Online
Reinforcement Learning [55.75959755058356]
In deep reinforcement learning, estimating the value function is essential to evaluate the quality of states and actions.
A recent study suggested that the error distribution for training the value function is often skewed because of the properties of the Bellman operator.
We propose a method called Symmetric Q-learning, in which synthetic noise drawn from a zero-mean distribution is added to the target values so that the error distribution becomes Gaussian.
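A minimal sketch of the target-noise idea, under assumptions: the Gaussian noise with a fixed scale used here is a placeholder choice, and the function name and hyperparameters are illustrative, not taken from the paper.

```python
import numpy as np

def noisy_bellman_targets(rewards, next_q_max, gamma=0.99, noise_scale=0.1, rng=None):
    """Bellman targets with zero-mean synthetic noise added, pushing the
    distribution of TD errors toward a symmetric (Gaussian) shape.
    noise_scale is a made-up hyperparameter for illustration only."""
    rng = rng or np.random.default_rng()
    targets = rewards + gamma * next_q_max
    return targets + rng.normal(0.0, noise_scale, size=targets.shape)
```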
arXiv Detail & Related papers (2024-03-12T14:49:19Z)
- Nonparametric Linear Feature Learning in Regression Through Regularisation [0.0]
We propose a novel method for joint linear feature learning and non-parametric function estimation.
By using alternating minimisation, we iteratively rotate the data to improve alignment with leading directions.
We establish that the expected risk of our method converges to the minimal risk under minimal assumptions and with explicit rates.
arXiv Detail & Related papers (2023-07-24T12:52:55Z)
- Minimax Excess Risk of First-Order Methods for Statistical Learning with Data-Dependent Oracles [25.557803548119466]
We provide sharp upper and lower bounds for the minimax excess risk of strongly convex and smooth statistical learning.
This novel class of oracles can query the gradient with any given data distribution.
arXiv Detail & Related papers (2023-07-10T16:29:05Z)
- A Tale of Sampling and Estimation in Discounted Reinforcement Learning [50.43256303670011]
We present a minimax lower bound on the discounted mean estimation problem.
We show that estimating the mean by directly sampling from the discounted kernel of the Markov process brings compelling statistical properties.
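One way to read "sampling from the discounted kernel" is to stop each rollout after a geometrically distributed number of steps and keep the reward from the final transition; averaging these samples estimates the normalized discounted return. A hedged sketch under that reading, with a hypothetical step(state, rng) environment interface:

```python
import numpy as np

def discounted_mean_estimate(step, s0, gamma=0.9, n_samples=1000, rng=None):
    """Monte Carlo estimate of (1 - gamma) * sum_{t>=1} gamma^(t-1) r_t:
    draw a horizon T ~ Geometric(1 - gamma), roll the chain forward T steps
    using the hypothetical `step(state, rng) -> (next_state, reward)`
    interface, and keep the reward from the final transition."""
    rng = rng or np.random.default_rng()
    total = 0.0
    for _ in range(n_samples):
        horizon = rng.geometric(1.0 - gamma)   # support {1, 2, ...}
        state, reward = s0, 0.0
        for _ in range(horizon):
            state, reward = step(state, rng)
        total += reward
    return total / n_samples
```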
arXiv Detail & Related papers (2023-04-11T09:13:17Z)
- Improved Convergence Rates for Sparse Approximation Methods in Kernel-Based Learning [48.08663378234329]
Kernel-based models such as kernel ridge regression and Gaussian processes are ubiquitous in machine learning applications.
Existing sparse approximation methods can yield a significant reduction in the computational cost.
We provide novel confidence intervals for the Nyström method and the sparse variational Gaussian process approximation method.
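For context, a generic Nyström-based kernel ridge regression sketch (randomly chosen inducing points, RBF kernel). It shows the kind of sparse approximation the entry refers to, not the paper's estimator or its confidence intervals; all names and hyperparameters are illustrative.

```python
import numpy as np

def rbf(X, Z, lengthscale=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and Z."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * lengthscale ** 2))

def nystrom_krr(X, y, m=50, lam=1e-2, rng=None):
    """Kernel ridge regression with a Nystrom approximation built from m
    randomly chosen inducing points (a generic sketch of the technique)."""
    rng = rng or np.random.default_rng()
    idx = rng.choice(len(X), size=min(m, len(X)), replace=False)
    Z = X[idx]                            # inducing points
    Kmm = rbf(Z, Z)
    Knm = rbf(X, Z)
    # Solve (Knm^T Knm + lam * Kmm) alpha = Knm^T y; predict with k(x, Z) alpha.
    A = Knm.T @ Knm + lam * Kmm + 1e-10 * np.eye(len(Z))
    alpha = np.linalg.solve(A, Knm.T @ y)
    return lambda Xq: rbf(Xq, Z) @ alpha
```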
arXiv Detail & Related papers (2022-02-08T17:22:09Z)
- Robust supervised learning with coordinate gradient descent [0.0]
We combine coordinate gradient descent as a learning algorithm with robust estimators of the partial derivatives.
This leads to robust statistical learning methods that have a numerical complexity nearly identical to non-robust ones.
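A toy sketch of the recipe, using median-of-means as one standard robust mean estimator; the paper's estimators, step sizes, and coordinate schedule may differ, and grad_j is a hypothetical callable returning per-sample partial derivatives.

```python
import numpy as np

def median_of_means(values, n_blocks=10, rng=None):
    """Median-of-means: a standard robust estimator of a mean."""
    rng = rng or np.random.default_rng()
    blocks = np.array_split(rng.permutation(values), n_blocks)
    return np.median([b.mean() for b in blocks])

def robust_coordinate_descent(grad_j, w0, n_iters=200, lr=0.1):
    """Cyclic coordinate descent where each partial derivative is replaced by
    a robust estimate over per-sample gradients.  `grad_j(w, j)` is assumed to
    return the array of per-sample partial derivatives with respect to w[j]."""
    w = np.asarray(w0, dtype=float).copy()
    for t in range(n_iters):
        j = t % len(w)                     # cyclic choice of coordinate
        w[j] -= lr * median_of_means(grad_j(w, j))
    return w
```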
arXiv Detail & Related papers (2022-01-31T17:33:04Z)
- MINIMALIST: Mutual INformatIon Maximization for Amortized Likelihood Inference from Sampled Trajectories [61.3299263929289]
Simulation-based inference enables learning the parameters of a model even when its likelihood cannot be computed in practice.
One class of methods uses data simulated with different parameters to infer an amortized estimator for the likelihood-to-evidence ratio.
We show that this approach can be formulated in terms of mutual information between model parameters and simulated data.
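The standard construction behind such ratio estimators pairs each parameter with its own simulated data (joint samples) and with shuffled data (product of marginals), then trains a binary classifier whose logit approximates the log likelihood-to-evidence ratio. A hedged sketch of the data construction only, not the paper's objective; the function name is illustrative.

```python
import numpy as np

def ratio_training_pairs(thetas, xs, rng=None):
    """Build training pairs for an amortized likelihood-to-evidence ratio
    estimator: label-1 pairs come from the joint (each theta with the x
    simulated from it), label-0 pairs from the product of marginals (thetas
    matched with shuffled x).  A classifier trained on these pairs has logits
    approximating log r(x | theta)."""
    rng = rng or np.random.default_rng()
    joint = np.column_stack([thetas, xs])
    marginal = np.column_stack([thetas, rng.permutation(xs)])
    X = np.vstack([joint, marginal])
    y = np.concatenate([np.ones(len(joint)), np.zeros(len(marginal))])
    return X, y
```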
arXiv Detail & Related papers (2021-06-03T12:59:16Z)
- Effective Proximal Methods for Non-convex Non-smooth Regularized Learning [27.775096437736973]
We show that an independent sampling scheme tends to improve on the performance of the commonly used uniform sampling scheme.
Our new analysis also derives a faster convergence speed for this sampling scheme than the best one available so far.
arXiv Detail & Related papers (2020-09-14T16:41:32Z)
- Principled learning method for Wasserstein distributionally robust optimization with local perturbations [21.611525306059985]
Wasserstein distributionally robust optimization (WDRO) attempts to learn a model that minimizes the local worst-case risk in the vicinity of the empirical data distribution.
We propose a minimizer based on a novel approximation theorem and provide the corresponding risk consistency results.
Our results show that the proposed method achieves significantly higher accuracy than baseline models on noisy datasets.
arXiv Detail & Related papers (2020-06-05T09:32:37Z)
- Minimax-Optimal Off-Policy Evaluation with Linear Function Approximation [49.502277468627035]
This paper studies the statistical theory of batch data reinforcement learning with function approximation.
Consider the off-policy evaluation problem, which is to estimate the cumulative value of a new target policy from logged history.
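As background, a minimal least-squares temporal-difference (LSTD) sketch of off-policy evaluation with linear function approximation; the feature matrices and regularizer are illustrative assumptions, and this is not the estimator the paper analyses.

```python
import numpy as np

def lstd_ope(phi_s, phi_next, rewards, gamma=0.99, reg=1e-6):
    """Least-squares temporal-difference estimate of a target policy's value
    from logged transitions with linear function approximation.  phi_s holds
    features of the logged states; phi_next holds features of the next states
    under the target policy's action choice (a generic LSTD sketch)."""
    d = phi_s.shape[1]
    A = phi_s.T @ (phi_s - gamma * phi_next) + reg * np.eye(d)
    b = phi_s.T @ rewards
    theta = np.linalg.solve(A, b)
    return theta          # estimated value at state s is phi(s) @ theta
```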
arXiv Detail & Related papers (2020-02-21T19:20:57Z)
- Orthogonal Statistical Learning [49.55515683387805]
We provide non-asymptotic excess risk guarantees for statistical learning in a setting where the population risk depends on an unknown nuisance parameter.
We show that if the population risk satisfies a condition called Neyman orthogonality, the impact of the nuisance estimation error on the excess risk bound achieved by the meta-algorithm is of second order.
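A toy instance of the orthogonality idea in a partially linear model, estimated by partialling out both nuisances with plain least squares (no cross-fitting, illustrative only; not the paper's meta-algorithm, and the function name is an assumption).

```python
import numpy as np

def orthogonal_effect_estimate(X, d, y):
    """Two-stage, Neyman-orthogonal estimate of theta in the partially linear
    model y = theta * d + g(X) + noise: fit both nuisances (E[y|X] and E[d|X],
    here by plain least squares for brevity), then regress the y-residuals on
    the d-residuals.  First-order errors in the nuisance fits only affect the
    final estimate at second order."""
    Xb = np.column_stack([np.ones(len(X)), X])            # add an intercept
    y_hat = Xb @ np.linalg.lstsq(Xb, y, rcond=None)[0]    # nuisance 1: E[y|X]
    d_hat = Xb @ np.linalg.lstsq(Xb, d, rcond=None)[0]    # nuisance 2: E[d|X]
    ry, rd = y - y_hat, d - d_hat
    return (rd @ ry) / (rd @ rd)                          # orthogonal estimate of theta
```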
arXiv Detail & Related papers (2019-01-25T02:21:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.