Minimax learning rates for estimating binary classifiers under margin conditions
- URL: http://arxiv.org/abs/2505.10628v1
- Date: Thu, 15 May 2025 18:05:10 GMT
- Title: Minimax learning rates for estimating binary classifiers under margin conditions
- Authors: Jonathan García, Philipp Petersen
- Abstract summary: We study classification problems using binary estimators where the decision boundary is described by horizon functions. We establish upper and lower bounds for the minimax learning rate over broad function classes with bounded Kolmogorov entropy in Lebesgue norms.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study classification problems using binary estimators where the decision boundary is described by horizon functions and where the data distribution satisfies a geometric margin condition. We establish upper and lower bounds for the minimax learning rate over broad function classes with bounded Kolmogorov entropy in Lebesgue norms. A key novelty of our work is the derivation of lower bounds on the worst-case learning rates under a geometric margin condition -- a setting that is almost universally satisfied in practice but remains theoretically challenging. Moreover, our results deal with the noiseless setting, where lower bounds are particularly hard to establish. We apply our general results to classification problems with decision boundaries belonging to several function classes: for Barron-regular functions, and for H\"older-continuous functions with strong margins, we identify optimal rates close to the fast learning rates of $\mathcal{O}(n^{-1})$ for $n \in \mathbb{N}$ samples. Also for merely convex decision boundaries, in a strong margin case optimal rates near $\mathcal{O}(n^{-1/2})$ can be achieved.
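For orientation, the sketch below states one common formalization of a geometric margin condition together with the minimax-rate statement it supports. The exact condition, the exponents $\gamma$ and $\beta$, and the constant $C$ are illustrative assumptions and may differ from the paper's precise definitions.

```latex
% Illustrative only: a standard formalization of a geometric margin condition
% and of a minimax learning rate; the paper's precise definitions may differ.
\begin{align*}
  &\text{(geometric margin)} &
    P_X\!\bigl(\{x : \operatorname{dist}(x, \partial\Omega) \le t\}\bigr)
      &\le C\, t^{\gamma} \quad \text{for all } t > 0, \\
  &\text{(minimax rate)} &
    \inf_{\hat f_n}\; \sup_{P}\;
      \mathbb{E}\bigl[\mathcal{R}_P(\hat f_n) - \mathcal{R}_P^{*}\bigr]
      &\asymp n^{-\beta},
\end{align*}
% Here \partial\Omega is the decision boundary of the Bayes classifier,
% \hat f_n ranges over estimators built from n i.i.d. samples, and
% \mathcal{R}_P is the misclassification risk. The abstract reports \beta
% close to 1 for Barron-regular and H\"older-continuous boundaries with
% strong margins, and close to 1/2 for merely convex boundaries.
```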
Related papers
- High-dimensional classification problems with Barron regular boundaries under margin conditions [0.0]
In particular, for strong margin conditions, high-dimensional discontinuous classifiers can be approximated with a rate that is typically only achievable when approximating a low-dimensional smooth function. We demonstrate how these expression rate bounds imply fast-rate learning bounds that are close to $n^{-1}$, where $n$ is the number of samples.
arXiv Detail & Related papers (2024-12-10T08:50:35Z) - Learning to Cover: Online Learning and Optimization with Irreversible Decisions [50.5775508521174]
We define an online learning and optimization problem with discrete and irreversible decisions contributing toward a coverage target. In each period, a decision-maker selects facilities to open, receives information on the success of each one, and updates a classification model to guide future decisions. The goal is to minimize facility openings under a chance constraint reflecting the coverage target, in a horizon regime characterized by a large target number of facilities.
arXiv Detail & Related papers (2024-06-20T23:00:25Z) - Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic
Shortest Path [80.60592344361073]
We study the Stochastic Shortest Path (SSP) problem with a linear mixture transition kernel.
An agent repeatedly interacts with an environment and seeks to reach a certain goal state while minimizing the cumulative cost.
Existing works often assume a strictly positive lower bound on the cost function or an upper bound on the expected length of the optimal policy.
arXiv Detail & Related papers (2024-02-14T07:52:00Z) - Adversarial Contextual Bandits Go Kernelized [21.007410990554522]
We study a generalization of the problem of online learning in adversarial linear contextual bandits by incorporating loss functions that belong to a reproducing kernel Hilbert space.
We propose a new optimistically biased estimator for the loss functions, achieving near-optimal regret guarantees.
arXiv Detail & Related papers (2023-10-02T19:59:39Z) - Universal Online Learning with Gradient Variations: A Multi-layer Online Ensemble Approach [57.92727189589498]
We propose an online convex optimization approach with two different levels of adaptivity.
We obtain $\mathcal{O}(\log V_T)$, $\mathcal{O}(d \log V_T)$ and $\hat{\mathcal{O}}(\sqrt{V_T})$ regret bounds for strongly convex, exp-concave and convex loss functions, respectively.
arXiv Detail & Related papers (2023-07-17T09:55:35Z) - A Block Coordinate Descent Method for Nonsmooth Composite Optimization under Orthogonality Constraints [7.9047096855236125]
We show that OBCD offers stronger optimality guarantees than standard critical points. We also demonstrate the non-ergodic convergence rate of OBCD.
arXiv Detail & Related papers (2023-04-07T13:44:59Z) - High-Probability Bounds for Stochastic Optimization and Variational
Inequalities: the Case of Unbounded Variance [59.211456992422136]
We propose algorithms with high-probability convergence results under less restrictive assumptions.
These results justify the usage of the considered methods for solving problems that do not fit standard functional classes in optimization.
arXiv Detail & Related papers (2023-02-02T10:37:23Z) - Optimal learning of high-dimensional classification problems using deep
neural networks [0.0]
We study the problem of learning classification functions from noiseless training samples, under the assumption that the decision boundary is of a certain regularity.
For the class of locally Barron-regular decision boundaries, we find that the optimal estimation rates are essentially independent of the underlying dimension.
arXiv Detail & Related papers (2021-12-23T14:15:10Z) - First-Order Regret in Reinforcement Learning with Linear Function
Approximation: A Robust Estimation Approach [57.570201404222935]
We show that it is possible to obtain regret scaling as $\mathcal{O}(\sqrt{V_1^{\star} K})$ in reinforcement learning with large state spaces.
We demonstrate that existing techniques based on least squares estimation are insufficient to obtain this result.
arXiv Detail & Related papers (2021-12-07T00:29:57Z) - Instance-optimality in optimal value estimation: Adaptivity via
variance-reduced Q-learning [99.34907092347733]
We analyze the problem of estimating optimal $Q$-value functions for a discounted Markov decision process with discrete states and actions.
Using a local minimax framework, we show that this functional arises in lower bounds on the accuracy of any estimation procedure.
In the other direction, we establish the sharpness of our lower bounds, up to factors logarithmic in the state and action spaces, by analyzing a variance-reduced version of $Q$-learning.
arXiv Detail & Related papers (2021-06-28T00:38:54Z) - High-probability Bounds for Non-Convex Stochastic Optimization with
Heavy Tails [55.561406656549686]
We consider non-convex stochastic optimization using first-order algorithms for which the gradient estimates may have heavy tails.
We show that a combination of gradient clipping, momentum, and normalized gradient descent yields convergence to critical points in high probability with the best-known iteration complexity for smooth losses; a minimal sketch of such an update appears after this list.
arXiv Detail & Related papers (2021-06-28T00:17:01Z)
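The last entry above describes combining gradient clipping, momentum, and normalized gradient descent to cope with heavy-tailed gradient noise. The Python sketch below illustrates that general recipe under stated assumptions; `clipped_normalized_sgd`, `noisy_grad`, and every parameter choice are hypothetical and are not taken from the cited paper.

```python
import numpy as np

def clipped_normalized_sgd(grad_oracle, x0, steps=2000, lr=1e-2,
                           momentum=0.9, clip=1.0, seed=0):
    """Minimal sketch (assumptions, not the cited paper's exact algorithm):
    gradient clipping + momentum + a normalized step, so that a single
    heavy-tailed gradient sample cannot move the iterate arbitrarily far."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    m = np.zeros_like(x)
    for _ in range(steps):
        g = grad_oracle(x, rng)                       # possibly heavy-tailed gradient estimate
        g_norm = np.linalg.norm(g)
        if g_norm > clip:                             # clipping bounds each sample's influence
            g = g * (clip / g_norm)
        m = momentum * m + (1.0 - momentum) * g       # momentum averages out the noise
        x = x - lr * m / (np.linalg.norm(m) + 1e-12)  # normalized step of fixed length lr
    return x

# Hypothetical usage: quadratic objective with heavy-tailed (Student-t) gradient noise.
def noisy_grad(x, rng):
    return 2.0 * x + rng.standard_t(df=2.0, size=x.shape)

x_hat = clipped_normalized_sgd(noisy_grad, x0=np.full(5, 5.0))
print(np.round(x_hat, 3))  # iterate should drift toward the minimizer at the origin
```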
This list is automatically generated from the titles and abstracts of the papers in this site.