Related papers: Beyond Tsybakov: Model Margin Noise and $\mathcal{H}$-Consistency Bounds

URL: http://arxiv.org/abs/2511.15816v1
Date: Wed, 19 Nov 2025 19:13:39 GMT
Title: Beyond Tsybakov: Model Margin Noise and $\mathcal{H}$-Consistency Bounds
Authors: Mehryar Mohri, Yutao Zhong
Abstract summary: We introduce a new low-noise condition for classification, the Model Margin Noise (MM noise) assumption. We derive enhanced $\mathcal{H}$-consistency bounds for both binary and multi-class classification.
Score: 42.67092904252001
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We introduce a new low-noise condition for classification, the Model Margin Noise (MM noise) assumption, and derive enhanced $\mathcal{H}$-consistency bounds under this condition. MM noise is weaker than the Tsybakov noise condition: it is implied by the Tsybakov noise condition but can hold even when the Tsybakov condition fails, because it depends on the discrepancy between a given hypothesis and the Bayes classifier rather than on the intrinsic distributional minimal margin (see Figure 1 for an illustration of an explicit example). This hypothesis-dependent assumption yields enhanced $\mathcal{H}$-consistency bounds for both binary and multi-class classification. Our results extend the enhanced $\mathcal{H}$-consistency bounds of Mao, Mohri, and Zhong (2025a) with the same favorable exponents but under a weaker assumption than the Tsybakov noise condition; they interpolate smoothly between linear and square-root regimes for intermediate noise levels. We also instantiate these bounds for common surrogate loss families and provide illustrative tables.
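
For reference, the Tsybakov noise condition that the abstract compares against is commonly stated for binary classification as below. This is a sketch of one standard textbook form, not the paper's own statement: the symbols $\eta$, $C$, $\beta$, and $t_0$ are notation assumed here, and the paper may use a different parametrization. The MM noise condition itself is not reproduced, since the abstract does not state it.

$$
\eta(x) = \mathbb{P}(Y = 1 \mid X = x), \qquad \mathbb{P}_X\left( \left| \eta(X) - \tfrac{1}{2} \right| \le t \right) \le C\, t^{\beta} \quad \text{for all } t \in (0, t_0].
$$

A larger $\beta$ means less probability mass near the decision boundary $\eta(x) = 1/2$. The condition constrains the distribution alone, which is the contrast the abstract draws with the hypothesis-dependent MM noise assumption.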

Related papers

Skewness-Robust Causal Discovery in Location-Scale Noise Models [47.09233752567902]
We propose SkewD, a likelihood-based algorithm for causal discovery under location-scale noise models. SkewD extends the usual normal-distribution framework to the skew-normal setting, enabling reliable inference under symmetric and skewed noise. We evaluate SkewD on novel synthetically generated datasets with skewed noise as well as established benchmark datasets.
arXiv Detail & Related papers (2025-11-18T12:40:41Z)
Stochastic Weakly Convex Optimization Under Heavy-Tailed Noises [55.43924214633558]
In this paper, we focus on two types of noises: one is sub-Weibull noises, and the other is SsBC noises. Under these two noise assumptions, the in-expectation and high-probability convergence of SFOMs have been studied in the contexts of convex optimization and smooth optimization.
arXiv Detail & Related papers (2025-07-17T16:48:45Z)
Regularized least squares learning with heavy-tailed noise is minimax optimal [22.406170258823803]
This paper examines the performance of ridge regression in reproducing kernel Hilbert spaces in the presence of noise that exhibits a finite number of higher moments. We establish risk bounds consisting of subgaussian and excess terms based on the well-known integral operator framework.
arXiv Detail & Related papers (2025-05-20T11:17:54Z)
Nonlinear Stochastic Gradient Descent and Heavy-tailed Noise: A Unified Framework and High-probability Guarantees [56.80920351680438]
We study high-probability convergence in online learning, in the presence of heavy-tailed noise. We provide guarantees for a broad class of nonlinearities, without any assumptions on noise moments.
arXiv Detail & Related papers (2024-10-17T18:25:28Z)
Revisiting Convergence of AdaGrad with Relaxed Assumptions [4.189643331553922]
We revisit the convergence of AdaGrad with momentum (covering AdaGrad as a special case) on problems. This model encompasses a broad range of noises, including sub-Gaussian noise, in many practical applications.
arXiv Detail & Related papers (2024-02-21T13:24:14Z)
Breaking the Heavy-Tailed Noise Barrier in Stochastic Optimization Problems [56.86067111855056]
We consider clipped optimization problems with heavy-tailed noise with structured density. We show that it is possible to get faster rates of convergence than $\mathcal{O}(K^{-(\alpha - 1)/\alpha})$, when the gradients have finite moments of order $\alpha$. We prove that the resulting estimates have negligible bias and controllable variance.
arXiv Detail & Related papers (2023-11-07T17:39:17Z)
Clipped Stochastic Methods for Variational Inequalities with Heavy-Tailed Noise [64.85879194013407]
We prove the first high-probability results with logarithmic dependence on the confidence level for methods for solving monotone and structured non-monotone VIPs. Our results match the best-known ones in the light-tails case and are novel for structured non-monotone problems. In addition, we numerically validate that the gradient noise of many practical formulations is heavy-tailed and show that clipping improves the performance of SEG/SGDA.
arXiv Detail & Related papers (2022-06-02T15:21:55Z)
Robust Learning under Strong Noise via SQs [5.9256596453465225]
We show that every SQ learnable class admits an efficient learning algorithm with OPT + $\epsilon$ misclassification error for a broad class of noise models. This setting substantially generalizes the widely studied problem of classification under RCN with known noise probabilities.
arXiv Detail & Related papers (2020-10-18T21:02:26Z)
Shape Matters: Understanding the Implicit Bias of the Noise Covariance [76.54300276636982]
Noise in gradient descent provides a crucial implicit regularization effect for training overparameterized models. We show that parameter-dependent noise, induced by mini-batches or label perturbation, is far more effective than Gaussian noise. Our analysis reveals that parameter-dependent noise introduces a bias towards local minima with smaller noise variance, whereas spherical Gaussian noise does not.
arXiv Detail & Related papers (2020-06-15T18:31:02Z)

This list is automatically generated from the titles and abstracts of the papers on this site.