Fundamental Novel Consistency Theory: $H$-Consistency Bounds
- URL: http://arxiv.org/abs/2512.22880v1
- Date: Sun, 28 Dec 2025 11:02:20 GMT
- Title: Fundamental Novel Consistency Theory: $H$-Consistency Bounds
- Authors: Yutao Zhong
- Abstract summary: In machine learning, the loss functions optimized during training often differ from the target loss that defines task performance. We present an in-depth study of the target loss estimation error relative to the surrogate loss estimation error. Our analysis leads to $H$-consistency bounds, which are guarantees accounting for the hypothesis set $H$.
- Score: 19.493449206135296
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In machine learning, the loss functions optimized during training often differ from the target loss that defines task performance due to computational intractability or lack of differentiability. We present an in-depth study of the target loss estimation error relative to the surrogate loss estimation error. Our analysis leads to $H$-consistency bounds, which are guarantees accounting for the hypothesis set $H$. These bounds offer stronger guarantees than Bayes-consistency or $H$-calibration and are more informative than excess error bounds. We begin with binary classification, establishing tight distribution-dependent and -independent bounds. We provide explicit bounds for convex surrogates (including linear models and neural networks) and analyze the adversarial setting for surrogates like $\rho$-margin and sigmoid loss. Extending to multi-class classification, we present the first $H$-consistency bounds for max, sum, and constrained losses, covering both non-adversarial and adversarial scenarios. We demonstrate that in some cases, non-trivial $H$-consistency bounds are unattainable. We also investigate comp-sum losses (e.g., cross-entropy, MAE), deriving their first $H$-consistency bounds and introducing smooth adversarial variants that yield robust learning algorithms. We develop a comprehensive framework for deriving these bounds across various surrogates, introducing new characterizations for constrained and comp-sum losses. Finally, we examine the growth rates of $H$-consistency bounds, establishing a universal square-root growth rate for smooth surrogates in binary and multi-class tasks, and analyze minimizability gaps to guide surrogate selection.
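To make the surrogate-versus-target distinction concrete, here is a minimal sketch (our own illustration, not code from the paper) of the zero-one target loss alongside the surrogates the abstract mentions, each evaluated at the margin $m = y\,h(x)$:

```python
import numpy as np

# Margin-based losses at m = y * h(x): zero-one is the target loss;
# the rest are surrogates that are actually optimized during training.
def zero_one(m):   return (m <= 0).astype(float)
def hinge(m):      return np.maximum(0.0, 1.0 - m)
def logistic(m):   return np.log2(1.0 + np.exp(-m))   # normalized so loss(0) = 1
def sigmoid(m):    return 1.0 - np.tanh(m)            # bounded, smooth
def rho_margin(m, rho=1.0):                           # rho-margin loss
    return np.clip(1.0 - m / rho, 0.0, 1.0)

for name, fn in [("zero-one", zero_one), ("hinge", hinge),
                 ("logistic", logistic), ("sigmoid", sigmoid),
                 ("rho-margin", rho_margin)]:
    print(f"{name:>10}:", np.round(fn(np.linspace(-2, 2, 5)), 2))
```

An $H$-consistency bound then controls the target excess error by a function $\Gamma$ of the surrogate excess error, with $\Gamma(t) \propto \sqrt{t}$ in the smooth case the abstract's growth-rate results describe.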
Related papers
- Is Softmax Loss All You Need? A Principled Analysis of Softmax-family Loss [91.61796429377041]
The Softmax loss is one of the most widely employed surrogate objectives for classification and ranking tasks. We investigate whether different surrogates achieve consistency with classification and ranking metrics, and analyze their gradient dynamics to reveal distinct convergence behaviors. Our results establish a principled foundation and offer practical guidance for loss selection in large-class machine learning applications.
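As a reference point for the gradient dynamics this summary mentions, the softmax cross-entropy gradient with respect to the logits has a standard closed form (a generic fact, not this paper's analysis):

```python
import numpy as np

def softmax_ce_grad(logits, y):
    """Gradient of cross-entropy(softmax(logits), y) w.r.t. the 1-D float
    array `logits`: softmax(logits) - onehot(y). It vanishes as the true
    class comes to dominate, which shapes the convergence behavior of
    softmax-family losses."""
    p = np.exp(logits - logits.max())  # numerically stable softmax
    p /= p.sum()
    p[y] -= 1.0
    return p
```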
arXiv Detail & Related papers (2026-01-30T09:24:52Z)
- Non-Stationary Online Structured Prediction with Surrogate Losses [22.848052937383244]
We prove a bound of the form $F_T + C(1 + P_T)$ on the cumulative target loss. Our core idea is to synthesize the dynamic regret bound of online gradient descent (OGD) with the technique of exploiting the surrogate gap. Our analysis also sheds light on a new Polyak-style learning rate for OGD, which systematically offers target-loss guarantees and exhibits promising empirical performance.
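For intuition, a minimal sketch of a Polyak-style OGD step (the classical rule with the optimal loss value taken to be zero; the paper's actual rule is tailored to its surrogate-gap analysis):

```python
import numpy as np

def ogd_polyak_step(w, loss, grad):
    """One online-gradient-descent step with a Polyak-style step size
    eta_t = f_t(w_t) / ||g_t||^2, the classical rule assuming the
    optimal surrogate value is zero. Schematic only; the paper derives
    its own variant from the surrogate gap."""
    gnorm2 = float(np.dot(grad, grad))
    return w if gnorm2 == 0.0 else w - (loss / gnorm2) * grad
```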
arXiv Detail & Related papers (2025-10-08T14:43:44Z)
- Mastering Multiple-Expert Routing: Realizable $H$-Consistency and Strong Guarantees for Learning to Defer [30.389055604165222]
This paper introduces novel surrogate loss functions and efficient algorithms with strong theoretical learning guarantees. We address open questions regarding realizable $H$-consistency, $H$-consistency bounds, and Bayes-consistency for both single-stage and two-stage learning scenarios. We derive new surrogate losses that achieve realizable $H$-consistency, $H$-consistency bounds, and Bayes-consistency for the two-expert scenario and, under natural assumptions, the multiple-expert scenario.
arXiv Detail & Related papers (2025-06-25T17:48:58Z)
- NDCG-Consistent Softmax Approximation with Accelerated Convergence [67.10365329542365]
We propose novel loss formulations that align directly with ranking metrics. We integrate the proposed RG losses with the highly efficient Alternating Least Squares (ALS) optimization method. Empirical evaluations on real-world datasets demonstrate that our approach achieves comparable or superior ranking performance.
arXiv Detail & Related papers (2025-06-11T06:59:17Z)
- Of Dice and Games: A Theory of Generalized Boosting [61.752303337418475]
We extend the celebrated theory of boosting to incorporate both cost-sensitive and multi-objective losses. We develop a comprehensive theory of cost-sensitive and multi-objective boosting, providing a taxonomy of weak learning guarantees. Our characterization relies on a geometric interpretation of boosting, revealing a surprising equivalence between cost-sensitive and multi-objective losses.
arXiv Detail & Related papers (2024-12-11T01:38:32Z)
- Realizable $H$-Consistent and Bayes-Consistent Loss Functions for Learning to Defer [30.389055604165222]
We introduce a broad family of surrogate losses, parameterized by a non-increasing function $\Psi$, and establish their realizable $H$-consistency under mild conditions.
For cost functions based on classification error, we show that these losses admit $H$-consistency bounds when the hypothesis set is symmetric and complete.
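To illustrate the shape of such a $\Psi$-parameterized family, here is a generic sum-type pattern (illustrative only; the paper's deferral construction additionally prices the defer option by the expert's cost):

```python
import numpy as np

def sum_family_loss(scores, y, psi=lambda t: np.exp(-t)):
    """Generic surrogate parameterized by a non-increasing Psi: the sum
    over wrong labels of Psi applied to the score gap. psi = exp(-t)
    recovers a multi-class exponential loss; any other non-increasing
    psi gives another family member. NOT the paper's learning-to-defer
    definition."""
    gaps = scores[y] - np.delete(scores, y)
    return float(np.sum(psi(gaps)))
```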
arXiv Detail & Related papers (2024-07-18T17:35:03Z)
- A Universal Growth Rate for Learning with Smooth Surrogate Losses [30.389055604165222]
We prove a square-root growth rate near zero for smooth margin-based surrogate losses in binary classification.
We extend this analysis to multi-class classification with a series of novel results.
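Numerically, a square-root growth rate means the guaranteed target excess shrinks far more slowly than the surrogate excess (illustrative constants):

```python
import numpy as np

# With an H-consistency bound of the form Gamma(t) ~ sqrt(t) near zero,
# driving the surrogate excess down by 100x improves the guaranteed
# target excess by only 10x:
for t in [1e-2, 1e-4, 1e-6]:
    print(f"surrogate excess {t:.0e} -> target-excess guarantee {np.sqrt(t):.0e}")
```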
arXiv Detail & Related papers (2024-05-09T17:59:55Z)
- Top-$k$ Classification and Cardinality-Aware Prediction [30.389055604165222]
We show that comp-sum and constrained losses are supported by $H$-consistency bounds with respect to the top-$k$ loss.
We introduce cardinality-aware loss functions through instance-dependent cost-sensitive learning.
Minimizing these losses leads to new cardinality-aware algorithms for top-$k$ classification.
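The target loss here is simple to state (a standard definition, with hypothetical example values):

```python
import numpy as np

def top_k_loss(scores, y, k):
    """Top-k zero-one loss: 1 if the true label y is not among the k
    highest-scoring labels, else 0."""
    return float(y not in np.argsort(scores)[-k:])

scores = np.array([0.1, 2.3, 0.7, 1.9])
print(top_k_loss(scores, y=2, k=1))  # 1.0: label 2 is not the argmax
print(top_k_loss(scores, y=2, k=3))  # 0.0: label 2 is in the top 3
```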
arXiv Detail & Related papers (2024-03-28T17:45:03Z)
- Label Distributionally Robust Losses for Multi-class Classification: Consistency, Robustness and Adaptivity [55.29408396918968]
We study a family of loss functions named label-distributionally robust (LDR) losses for multi-class classification.
Our contributions cover both consistency and robustness, establishing top-$k$ consistency of LDR losses for multi-class classification.
We propose a new adaptive LDR loss that automatically adapts an individualized temperature parameter to the noise level of each instance's class label.
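The role of the temperature can be seen in the entropy-regularized worst case over label distributions, which has a closed form (a standard identity; the paper's exact LDR definition may differ):

```python
import numpy as np

def entropy_reg_worst_case(losses, tau):
    """max over distributions p of <p, losses> - tau * KL(p || uniform)
    = tau * logsumexp(losses / tau) - tau * log(n), for a 1-D float
    array `losses` of length n. tau -> 0 recovers the max loss and
    tau -> inf the average, so a per-instance tau interpolates between
    worst-case and average-case behavior."""
    z = losses / tau
    m = z.max()
    return tau * (m + np.log(np.sum(np.exp(z - m)))) - tau * np.log(len(losses))
```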
arXiv Detail & Related papers (2021-12-30T00:27:30Z)
- Instance-optimality in optimal value estimation: Adaptivity via variance-reduced Q-learning [99.34907092347733]
We analyze the problem of estimating optimal $Q$-value functions for a discounted Markov decision process with discrete states and actions.
Using a local minimax framework, we show that this functional arises in lower bounds on the accuracy of any estimation procedure.
In the other direction, we establish the sharpness of our lower bounds, up to factors logarithmic in the state and action spaces, by analyzing a variance-reduced version of $Q$-learning.
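A schematic of the variance-reduction idea (SVRG-style recentering around a reference table; the function name and sampler are hypothetical):

```python
import numpy as np

def vr_q_step(Q, Q_bar, T_bar, s, a, R, gamma, lr, sample_next, rng):
    """One variance-reduced Q-learning update (sketch only). T_bar is a
    Monte-Carlo estimate of the Bellman operator applied to the fixed
    reference table Q_bar, recomputed once per epoch. Recentering the
    one-sample update around it cancels most of the sampling noise once
    Q is close to Q_bar. `sample_next(s, a, rng)` is an assumed
    generative-model sampler; Q, Q_bar, T_bar, R are (S, A) arrays."""
    s_next = sample_next(s, a, rng)
    t_hat = R[s, a] + gamma * Q[s_next].max()      # noisy Bellman target at Q
    t_ref = R[s, a] + gamma * Q_bar[s_next].max()  # same sample, evaluated at Q_bar
    target = t_hat - t_ref + T_bar[s, a]           # variance-reduced target
    Q[s, a] += lr * (target - Q[s, a])
    return Q
```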
arXiv Detail & Related papers (2021-06-28T00:38:54Z)
- Provable tradeoffs in adversarially robust classification [96.48180210364893]
We develop and leverage new tools, including recent breakthroughs from probability theory on robust isoperimetry.
Our results reveal fundamental tradeoffs between standard and robust accuracy that grow when data is imbalanced.
arXiv Detail & Related papers (2020-06-09T09:58:19Z)