Approximate Replicability in Learning
- URL: http://arxiv.org/abs/2510.20200v1
- Date: Thu, 23 Oct 2025 04:36:01 GMT
- Title: Approximate Replicability in Learning
- Authors: Max Hopkins, Russell Impagliazzo, Christopher Ye
- Abstract summary: We propose three natural relaxations of replicability in the context of PAC learning. For constant replicability parameters, we obtain sample-optimal PAC learners.
- Score: 5.613537675448949
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Replicability, introduced by (Impagliazzo et al. STOC '22), is the notion that algorithms should remain stable under a resampling of their inputs (given access to shared randomness). While a strong and interesting notion of stability, the cost of replicability can be prohibitive: there is no replicable algorithm, for instance, for tasks as simple as threshold learning (Bun et al. STOC '23). Given such strong impossibility results we ask: under what approximate notions of replicability is learning possible? In this work, we propose three natural relaxations of replicability in the context of PAC learning: (1) Pointwise: the learner must be consistent on any fixed input, but not across all inputs simultaneously, (2) Approximate: the learner must output hypotheses that classify most of the distribution consistently, (3) Semi: the algorithm is fully replicable, but may additionally use shared unlabeled samples. In all three cases, for constant replicability parameters, we obtain sample-optimal agnostic PAC learners: (1) and (2) are achievable for ``free'' using $\Theta(d/\alpha^2)$ samples, while (3) requires $\Theta(d^2/\alpha^2)$ labeled samples.
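For intuition, below is a minimal Python sketch of the shared-randomness rounding trick that underlies much of the replicability literature: an empirical estimate is snapped to a grid whose offset comes from the shared random string, so two executions on fresh samples, but with the same string, typically return the identical value. Function and parameter names are illustrative and not taken from the paper.

```python
import numpy as np

def replicable_mean(samples, alpha, shared_seed):
    """Sketch of a replicable mean estimator via randomized rounding.

    Two runs on independent samples from the same distribution, but with
    the same shared_seed, return the identical value with high probability
    once the sampling error is well below alpha.
    (Illustrative names; not code from the paper.)
    """
    rng = np.random.default_rng(shared_seed)   # the shared random string
    offset = rng.uniform(0.0, alpha)           # randomly shifted grid
    empirical = float(np.mean(samples))
    # Snap the estimate to the grid {offset + alpha * z : z integer}.
    # Resampling perturbs `empirical` by much less than alpha, so both
    # runs usually fall into the same grid cell and agree exactly.
    return offset + alpha * round((empirical - offset) / alpha)

# Fresh samples in each run, same shared seed: outputs coincide w.h.p.
data = np.random.default_rng(1)
print(replicable_mean(data.uniform(size=10_000), alpha=0.1, shared_seed=7))
print(replicable_mean(data.uniform(size=10_000), alpha=0.1, shared_seed=7))
```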
Related papers
- The Role of Randomness in Stability [20.718747268949112]
We study the randomness complexity of two influential notions of stability in learning: replicability and differential privacy. We prove a 'weak-to-strong' boosting theorem for stability: the randomness complexity of a task is tightly controlled by the best replication probability.
arXiv Detail & Related papers (2025-02-11T23:06:43Z)
- Optimal Multi-Distribution Learning [88.3008613028333]
Multi-distribution learning seeks to learn a shared model that minimizes the worst-case risk across $k$ distinct data distributions. We propose a novel algorithm that yields an $\varepsilon$-optimal randomized hypothesis with a sample complexity on the order of $(d+k)/\varepsilon^2$.
arXiv Detail & Related papers (2023-12-08T16:06:29Z)
- Testable Learning with Distribution Shift [9.036777309376697]
We define a new model called testable learning with distribution shift.
We obtain provably efficient algorithms for certifying the performance of a classifier on a test distribution.
We give several positive results for learning concept classes such as halfspaces, intersections of halfspaces, and decision trees.
arXiv Detail & Related papers (2023-11-25T23:57:45Z)
- Replicability and stability in learning [16.936594801109557]
Impagliazzo, Lei, Pitassi and Sorrell (2022) recently initiated the study of replicability in machine learning.
We show how to boost any replicable algorithm so that it produces the same output with probability arbitrarily close to 1.
We prove that list replicability can be boosted so that it is achieved with probability arbitrarily close to 1.
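As a rough illustration of such amplification (not the paper's exact boosting construction), the sketch below reuses a single shared random string across several independent sample batches and returns the plurality output; when that string admits a canonical output with probability above 1/2, the vote recovers it except with probability exponentially small in the number of batches. Here `base_alg` is a hypothetical replicable learner taking samples and a shared seed.

```python
from collections import Counter

def amplify(base_alg, sample_batches, shared_seed):
    """Plurality-vote amplification sketch for a replicable algorithm.

    Runs base_alg on independent batches while reusing the SAME shared
    random string, then outputs the most common result. (Illustrative
    only; `base_alg(batch, seed)` is an assumed interface.)
    """
    outputs = [base_alg(batch, shared_seed) for batch in sample_batches]
    # Outputs must be hashable, e.g. a tuple encoding the hypothesis.
    return Counter(outputs).most_common(1)[0][0]
```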
arXiv Detail & Related papers (2023-04-07T17:52:26Z)
- List and Certificate Complexities in Replicable Learning [0.7829352305480285]
We consider two feasible notions of replicability called list replicability and certificate replicability.
We design algorithms for certain learning problems that are optimal in list and certificate complexity.
arXiv Detail & Related papers (2023-04-05T06:05:27Z)
- Replicable Clustering [57.19013971737493]
We propose algorithms for the statistical $k$-medians, statistical $k$-means, and statistical $k$-centers problems by utilizing approximation routines for their counterparts in a black-box manner.
We also provide experiments on synthetic distributions in 2D, using the $k$-means++ implementation from sklearn as a black box, that validate our theoretical results.
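To give a flavor of such black-box usage, here is a loose sketch that wraps sklearn's k-means++ and snaps the returned centers to a randomly shifted grid drawn from shared randomness. This is illustrative only, not the paper's replicable clustering algorithm, and `grid_width` is a hypothetical parameter.

```python
import numpy as np
from sklearn.cluster import KMeans

def rounded_kmeans(points, k, grid_width, shared_seed):
    """Sketch: k-means++ as a black box plus shared-randomness rounding.

    Centers returned by sklearn are snapped to a grid whose shift is
    drawn from the shared random string, so nearby solutions from two
    runs tend to collapse to the same rounded centers. (Illustrative
    only; not the paper's algorithm.)
    """
    rng = np.random.default_rng(shared_seed)
    centers = KMeans(n_clusters=k, init="k-means++", n_init=10).fit(points).cluster_centers_
    shift = rng.uniform(0.0, grid_width, size=centers.shape[1])  # shared shift
    snapped = shift + grid_width * np.round((centers - shift) / grid_width)
    # Sort rows so the output does not depend on sklearn's center order.
    return snapped[np.lexsort(snapped.T[::-1])]
```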
arXiv Detail & Related papers (2023-02-20T23:29:43Z)
- On the Stability and Generalization of Triplet Learning [55.75784102837832]
Triplet learning, i.e., learning from triplet data, has attracted much attention in computer vision tasks.
This paper investigates the generalization guarantees of triplet learning by leveraging the stability analysis.
arXiv Detail & Related papers (2023-02-20T07:32:50Z)
- Generalized Differentiable RANSAC [95.95627475224231]
$\nabla$-RANSAC is a differentiable RANSAC that allows learning the entire randomized robust estimation pipeline.
$\nabla$-RANSAC is superior to the state-of-the-art in terms of accuracy while running at a similar speed to its less accurate alternatives.
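For background on the pipeline $\nabla$-RANSAC makes differentiable, here is a textbook (non-differentiable) RANSAC loop for 2D line fitting; it is generic background rather than code from the paper.

```python
import numpy as np

def ransac_line(points, iters=100, tol=0.05, seed=0):
    """Classic RANSAC for a 2D line: sample minimal sets, score inliers,
    keep the best model. Standard baseline only, not the paper's method."""
    rng = np.random.default_rng(seed)
    best = (0, None)  # (inlier count, (point on line, unit normal))
    for _ in range(iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        d = points[j] - points[i]
        n = np.array([-d[1], d[0]])       # normal to the candidate line
        if not np.linalg.norm(n):
            continue                       # degenerate: coincident points
        n = n / np.linalg.norm(n)
        # Count points within distance tol of the candidate line.
        inliers = int(np.sum(np.abs((points - points[i]) @ n) < tol))
        if inliers > best[0]:
            best = (inliers, (points[i], n))
    return best
```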
arXiv Detail & Related papers (2022-12-26T15:13:13Z)
- Learning versus Refutation in Noninteractive Local Differential Privacy [133.80204506727526]
We study two basic statistical tasks in non-interactive local differential privacy (LDP): learning and refutation.
Our main result is a complete characterization of the sample complexity of PAC learning for non-interactive LDP protocols.
arXiv Detail & Related papers (2022-10-26T03:19:24Z)
- Learning Halfspaces with Tsybakov Noise [50.659479930171585]
We study the learnability of halfspaces in the presence of Tsybakov noise.
We give an algorithm that achieves misclassification error $\epsilon$ with respect to the true halfspace.
arXiv Detail & Related papers (2020-06-11T14:25:02Z)
- Proper Learning, Helly Number, and an Optimal SVM Bound [29.35254938542589]
We characterize classes for which the optimal sample complexity can be achieved by a proper learning algorithm.
We show that the dual Helly number is bounded if and only if there is a proper learner with optimal joint dependence on $\epsilon$ and $\delta$.
arXiv Detail & Related papers (2020-05-24T18:11:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.