Improved Bounds for Swap Multicalibration and Swap Omniprediction
- URL: http://arxiv.org/abs/2505.20885v2
- Date: Wed, 28 May 2025 08:05:07 GMT
- Title: Improved Bounds for Swap Multicalibration and Swap Omniprediction
- Authors: Haipeng Luo, Spandan Senapati, Vatsal Sharan,
- Abstract summary: We show that it is possible to efficiently achieve $O(sqrtT)$ $ell_2$-swap multicalibration error against bounded linear functions.<n>We also establish a $O(varepsilon -3)$ sample complexity of efficiently learning an $varepsilon$-swap omnipredictor for the class of convex and Lipschitz functions.
- Score: 31.959887895880765
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we consider the related problems of multicalibration -- a multigroup fairness notion and omniprediction -- a simultaneous loss minimization paradigm, both in the distributional and online settings. The recent work of Garg et al. (2024) raised the open problem of whether it is possible to efficiently achieve $O(\sqrt{T})$ $\ell_{2}$-multicalibration error against bounded linear functions. In this paper, we answer this question in a strongly affirmative sense. We propose an efficient algorithm that achieves $O(T^{\frac{1}{3}})$ $\ell_{2}$-swap multicalibration error (both in high probability and expectation). On propagating this bound onward, we obtain significantly improved rates for $\ell_{1}$-swap multicalibration and swap omniprediction for a loss class of convex Lipschitz functions. In particular, we show that our algorithm achieves $O(T^{\frac{2}{3}})$ $\ell_{1}$-swap multicalibration and swap omniprediction errors, thereby improving upon the previous best-known bound of $O(T^{\frac{7}{8}})$. As a consequence of our improved online results, we further obtain several improved sample complexity rates in the distributional setting. In particular, we establish a $O(\varepsilon ^ {-3})$ sample complexity of efficiently learning an $\varepsilon$-swap omnipredictor for the class of convex and Lipschitz functions, $O(\varepsilon ^{-2.5})$ sample complexity of efficiently learning an $\varepsilon$-swap agnostic learner for the squared loss, and $O(\varepsilon ^ {-5}), O(\varepsilon ^ {-2.5})$ sample complexities of learning $\ell_{1}, \ell_{2}$-swap multicalibrated predictors against linear functions, all of which significantly improve on the previous best-known bounds.
Related papers
- Robust learning of halfspaces under log-concave marginals [6.852292115526837]
We give an algorithm that learns linear threshold functions and returns a classifier with boundary volume $O(r+varepsilon)$ at radius perturbation $r$.<n>The time and sample complexity of $dtildeO (1/varepsilon2)$ matches the complexity of Boolean regression.
arXiv Detail & Related papers (2025-05-19T20:12:16Z) - Full Swap Regret and Discretized Calibration [18.944031222413294]
We study the problem of minimizing swap regret in structured normal-form games.<n>We introduce a new online learning problem we call emphfull swap regret minimization<n>We also apply these tools to the problem of online forecasting to calibration error.
arXiv Detail & Related papers (2025-02-13T13:49:52Z) - Robust Sparse Regression with Non-Isotropic Designs [4.964650614497048]
We develop a technique to design efficiently computable estimators for sparse linear regression in the simultaneous presence of two adversaries.
We provide a novel analysis of weighted penalized Huber loss that is suitable for heavy-tailed designs in the presence of two adversaries.
arXiv Detail & Related papers (2024-10-31T13:51:59Z) - Quantum Algorithms for Non-smooth Non-convex Optimization [30.576546266390714]
This paper considers the problem for finding the $(,epsilon)$-Goldstein stationary point of Lipschitz continuous objective.
We construct a zeroth-order quantum estimator for the surrogate oracle function.
arXiv Detail & Related papers (2024-10-21T16:52:26Z) - Projection by Convolution: Optimal Sample Complexity for Reinforcement Learning in Continuous-Space MDPs [56.237917407785545]
We consider the problem of learning an $varepsilon$-optimal policy in a general class of continuous-space Markov decision processes (MDPs) having smooth Bellman operators.
Key to our solution is a novel projection technique based on ideas from harmonic analysis.
Our result bridges the gap between two popular but conflicting perspectives on continuous-space MDPs.
arXiv Detail & Related papers (2024-05-10T09:58:47Z) - Efficiently Learning One-Hidden-Layer ReLU Networks via Schur
Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $mathbbRd$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/epsilon)O(k)$, whereepsilon>0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z) - Near-Optimal Non-Convex Stochastic Optimization under Generalized
Smoothness [21.865728815935665]
Two recent works established the $O(epsilon-3)$ sample complexity to obtain an $O(epsilon)$-stationary point.
However, both require a large batch size on the order of $mathrmploy(epsilon-1)$, which is not only computationally burdensome but also unsuitable for streaming applications.
In this work, we solve the prior two problems simultaneously by revisiting a simple variant of the STORM algorithm.
arXiv Detail & Related papers (2023-02-13T00:22:28Z) - ReSQueing Parallel and Private Stochastic Convex Optimization [59.53297063174519]
We introduce a new tool for BFG convex optimization (SCO): a Reweighted Query (ReSQue) estimator for the gradient of a function convolved with a (Gaussian) probability density.
We develop algorithms achieving state-of-the-art complexities for SCO in parallel and private settings.
arXiv Detail & Related papers (2023-01-01T18:51:29Z) - Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$
Geometry [69.24618367447101]
Up to logarithmic factors the optimal excess population loss of any $(varepsilon,delta)$-differently private is $sqrtlog(d)/n + sqrtd/varepsilon n.$
We show that when the loss functions satisfy additional smoothness assumptions, the excess loss is upper bounded (up to logarithmic factors) by $sqrtlog(d)/n + (log(d)/varepsilon n)2/3.
arXiv Detail & Related papers (2021-03-02T06:53:44Z) - Projection Efficient Subgradient Method and Optimal Nonsmooth
Frank-Wolfe Method [54.93433440034386]
We find a feasible $epsilon$-suboptimal solution using only $O(epsilon-1)$ PO calls and optimal $O(epsilon-2)$ FO calls.
Our experiments confirm that these methods achieve significant speedups over the state-of-the-art, for a problem with costly PO and LMO calls.
arXiv Detail & Related papers (2020-10-05T08:16:56Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = langle X,w* rangle + epsilon$
We propose estimators for this problem under two settings: (i) $X$ is L4-L2 hypercontractive, $mathbbE [XXtop]$ has bounded condition number and $epsilon$ has bounded variance and (ii) $X$ is sub-Gaussian with identity second moment and $epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that for both problems, for dimensions of $frac1lambdaepsilon$, one can obtain streaming algorithms with spacely smaller than $frac1lambdaepsilon$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.