Replicable Online Learning
- URL: http://arxiv.org/abs/2411.13730v1
- Date: Wed, 20 Nov 2024 22:10:37 GMT
- Title: Replicable Online Learning
- Authors: Saba Ahmadi, Siddharth Bhandari, Avrim Blum,
- Abstract summary: We investigate the concept of algorithmic replicability introduced by Impagliazzo et al. and Ghazi et al.
In our model, the input sequence received by the online learner is generated from time-varying distributions chosen by an adversary.
Our objective is to design low-regret online algorithms that, with high probability, produce the exact same sequence of actions when run on two independently sampled input sequences.
- Score: 12.14234796585091
- Abstract: We investigate the concept of algorithmic replicability introduced by Impagliazzo et al. 2022, Ghazi et al. 2021, Ahn et al. 2024 in an online setting. In our model, the input sequence received by the online learner is generated from time-varying distributions chosen by an adversary (obliviously). Our objective is to design low-regret online algorithms that, with high probability, produce the exact same sequence of actions when run on two independently sampled input sequences generated as described above. We refer to such algorithms as adversarially replicable. Previous works (such as Esfandiari et al. 2022) explored replicability in the online setting under inputs generated independently from a fixed distribution; we term this notion as iid-replicability. Our model generalizes to capture both adversarial and iid input sequences, as well as their mixtures, which can be modeled by setting certain distributions as point-masses. We demonstrate adversarially replicable online learning algorithms for online linear optimization and the experts problem that achieve sub-linear regret. Additionally, we propose a general framework for converting an online learner into an adversarially replicable one within our setting, bounding the new regret in terms of the original algorithm's regret. We also present a nearly optimal (in terms of regret) iid-replicable online algorithm for the experts problem, highlighting the distinction between the iid and adversarial notions of replicability. Finally, we establish lower bounds on the regret (in terms of the replicability parameter and time) that any replicable online algorithm must incur.
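To make the replicability idea concrete, here is a minimal sketch (not the paper's actual algorithm) of a follow-the-leader experts learner made replicable by rounding cumulative losses to a grid with a random offset shared across runs; all names are illustrative:

```python
import numpy as np

def replicable_experts(loss_stream, n_experts, grid=0.05, seed=0):
    """Follow-the-leader over rounded cumulative losses.

    A shared seed fixes one random rounding offset; two runs on
    independently sampled loss sequences then land in the same grid
    cell with high probability, and so play the same action sequence.
    """
    rng = np.random.default_rng(seed)   # same seed on both runs
    offset = rng.uniform(0.0, grid)     # one shared random offset
    cum = np.zeros(n_experts)
    actions = []
    for losses in loss_stream:
        rounded = np.floor((cum - offset) / grid)  # coarsen before deciding
        actions.append(int(np.argmin(rounded)))    # deterministic tie-break
        cum += np.asarray(losses, dtype=float)
    return actions
```

Because the decision at each round depends only on the rounded cumulative losses, small sampling fluctuations between the two input sequences are absorbed by the grid, which is the high-level device behind iid-replicability results in this line of work.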
Related papers
- On the Computational Landscape of Replicable Learning [40.274579987732416]
We study computational aspects of algorithmic replicability, a notion of stability introduced by Impagliazzo, Lei, Pitassi, and Sorrell.
Motivated by a recent line of work that established strong statistical connections between replicability and other notions of learnability, we aim to understand better the computational connections between replicability and these learning paradigms.
arXiv Detail & Related papers (2024-05-24T14:30:40Z)
- Replication-proof Bandit Mechanism Design with Bayesian Agents [11.758708370032469]
We study the problem of designing replication-proof bandit mechanisms when agents strategically register or replicate their own arms.
We consider Bayesian agents who only know the distribution from which their own arms' mean rewards are sampled, unlike the original setting of Shin et al. 2022.
arXiv Detail & Related papers (2023-12-28T08:36:35Z)
- Replicable Reinforcement Learning [15.857503103543308]
We provide a provably replicable algorithm for parallel value iteration, and a provably replicable version of R-max in the episodic setting.
These are the first formal replicability results for control problems, which present different challenges for replication than batch learning settings.
arXiv Detail & Related papers (2023-05-24T16:05:15Z)
- Online Regenerative Learning [0.0]
We study a type of Online Linear Programming (OLP) problem that maximizes the objective function over sequentially arriving inputs.
The performance of various algorithms that analyze this type of OLP is well studied when the inputs follow some i.i.d distribution.
arXiv Detail & Related papers (2022-09-18T21:04:56Z)
- Smoothed Online Learning is as Easy as Statistical Learning [77.00766067963195]
We provide the first oracle-efficient, no-regret algorithms in this setting.
We show that if a function class is learnable in the classical setting, then there is an oracle-efficient, no-regret algorithm for contextual bandits.
arXiv Detail & Related papers (2022-02-09T19:22:34Z)
- Discovering Non-monotonic Autoregressive Orderings with Variational Inference [67.27561153666211]
We develop an unsupervised parallelizable learner that discovers high-quality generation orders purely from training data.
We implement the encoder as a Transformer with non-causal attention that outputs permutations in one forward pass.
Empirical results in language modeling tasks demonstrate that our method is context-aware and discovers orderings that are competitive with or even better than fixed orders.
arXiv Detail & Related papers (2021-10-27T16:08:09Z)
- Online Adversarial Attacks [57.448101834579624]
We formalize the online adversarial attack problem, emphasizing two key elements found in real-world use-cases.
We first rigorously analyze a deterministic variant of the online threat model.
We then propose algoname, a simple yet practical algorithm yielding a provably better competitive ratio for $k=2$ over the current best single threshold algorithm.
arXiv Detail & Related papers (2021-03-02T20:36:04Z)
- Double Coverage with Machine-Learned Advice [100.23487145400833]
We study the fundamental online $k$-server problem in a learning-augmented setting.
We show that our algorithm achieves, for any $k$, an almost optimal consistency-robustness tradeoff.
arXiv Detail & Related papers (2021-03-02T11:04:33Z)
- Active Online Learning with Hidden Shifting Domains [64.75186088512034]
We propose a surprisingly simple algorithm that adaptively balances its regret and its number of label queries.
Our algorithm can adaptively deal with interleaving spans of inputs from different domains.
arXiv Detail & Related papers (2020-06-25T15:23:59Z)
- Competitive Mirror Descent [67.31015611281225]
Constrained competitive optimization involves multiple agents trying to minimize conflicting objectives, subject to constraints.
We propose competitive mirror descent (CMD): a general method for solving such problems based on first order information.
As a special case we obtain a novel competitive multiplicative weights algorithm for problems on the positive cone.
arXiv Detail & Related papers (2020-06-17T22:11:35Z)
- A Modern Introduction to Online Learning [15.974402990630402]
Online learning refers to the framework of minimization of regret under worst-case assumptions.
I present first-order and second-order algorithms for online learning with convex losses.
arXiv Detail & Related papers (2019-12-31T08:16:31Z)
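The first-order approach summarized above can be illustrated with projected online gradient descent over an l2 ball — a standard textbook sketch under convex losses, not code from the monograph itself:

```python
import numpy as np

def online_gradient_descent(grad_fns, dim, radius=1.0):
    """Projected online gradient descent, the canonical first-order
    online learner for convex losses over an l2 ball.

    grad_fns yields, per round, the gradient of that round's loss at
    the current iterate; step sizes radius / sqrt(t) give regret
    growing like sqrt(T) against any comparator in the ball.
    """
    x = np.zeros(dim)
    for t, grad_fn in enumerate(grad_fns, start=1):
        g = grad_fn(x)
        x = x - (radius / np.sqrt(t)) * g   # decreasing step size
        norm = np.linalg.norm(x)
        if norm > radius:                    # project back onto the ball
            x *= radius / norm
    return x
```

Second-order methods replace the fixed step-size schedule with curvature-adaptive updates, but the project-after-step structure is the same.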
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content listed here (including all information) and is not responsible for any consequences of its use.