Related papers: Adversarial Online Learning with Changing Action Sets: Efficient Algorithms with Approximate Regret Bounds

Adversarial Online Learning with Changing Action Sets: Efficient Algorithms with Approximate Regret Bounds

URL: http://arxiv.org/abs/2003.03490v2
Date: Mon, 26 Apr 2021 09:17:56 GMT
Title: Adversarial Online Learning with Changing Action Sets: Efficient Algorithms with Approximate Regret Bounds
Authors: Ehsan Emamjomeh-Zadeh, Chen-Yu Wei, Haipeng Luo, David Kempe
Abstract summary: We revisit the problem of online learning with sleeping experts/bandits. In each time step, only a subset of the actions are available for the algorithm to choose from. We give an algorithm that provides a no-approximate-regret guarantee for the general sleeping expert/bandit problems.
Score: 48.312484940846
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We revisit the problem of online learning with sleeping experts/bandits: in each time step, only a subset of the actions are available for the algorithm to choose from (and learn about). The work of Kleinberg et al. (2010) showed that there exist no-regret algorithms which perform no worse than the best ranking of actions asymptotically. Unfortunately, achieving this regret bound appears computationally hard: Kanade and Steinke (2014) showed that achieving this no-regret performance is at least as hard as PAC-learning DNFs, a notoriously difficult problem. In the present work, we relax the original problem and study computationally efficient no-approximate-regret algorithms: such algorithms may exceed the optimal cost by a multiplicative constant in addition to the additive regret. We give an algorithm that provides a no-approximate-regret guarantee for the general sleeping expert/bandit problems. For several canonical special cases of the problem, we give algorithms with significantly better approximation ratios; these algorithms also illustrate different techniques for achieving no-approximate-regret guarantees.

Related papers

Computational Intractability of Strategizing against Online Learners [7.225039292056237]
We show that a standard no-regret algorithm can compute a near-optimal strategy against a learner. This establishes a fundamental computational barrier to finding optimal strategies in general game-theoretic settings.
arXiv Detail & Related papers (2025-03-06T08:28:50Z)
Learning Arithmetic Formulas in the Presence of Noise: A General Framework and Applications to Unsupervised Learning [4.10375234787249]
We present a framework for designing efficient algorithms for unsupervised learning problems. Our framework is based on a meta algorithm that learns arithmetic circuits in the presence of noise.
arXiv Detail & Related papers (2023-11-13T12:26:25Z)
Learning the Positions in CountSketch [49.57951567374372]
We consider sketching algorithms which first compress data by multiplication with a random sketch matrix, and then apply the sketch to quickly solve an optimization problem. In this work, we propose the first learning-based algorithms that also optimize the locations of the non-zero entries.
arXiv Detail & Related papers (2023-06-11T07:28:35Z)
Dual Algorithmic Reasoning [9.701208207491879]
We propose to learn algorithms by exploiting duality of the underlying algorithmic problem. We demonstrate that simultaneously learning the dual definition of these optimisation problems in algorithmic learning allows for better learning. We then validate the real-world utility of our dual algorithmic reasoner by deploying it on a challenging brain vessel classification task.
arXiv Detail & Related papers (2023-02-09T08:46:23Z)
Causal Bandits without Graph Learning [28.021500949026766]
We develop an efficient algorithm for finding the parent node of the reward node using atomic interventions. We extend our algorithm to the case when the reward node has multiple parents.
arXiv Detail & Related papers (2023-01-26T20:27:14Z)
Practical, Provably-Correct Interactive Learning in the Realizable Setting: The Power of True Believers [12.09273192079783]
We consider interactive learning in the realizable setting and develop a general framework to handle problems ranging from best arm identification to active classification. We design novel computationally efficient algorithms for the realizable setting that match the minimax lower bound up to logarithmic factors.
arXiv Detail & Related papers (2021-11-09T02:33:36Z)
Machine Learning for Online Algorithm Selection under Censored Feedback [71.6879432974126]
In online algorithm selection (OAS), instances of an algorithmic problem class are presented to an agent one after another, and the agent has to quickly select a presumably best algorithm from a fixed set of candidate algorithms. For decision problems such as satisfiability (SAT), quality typically refers to the algorithm's runtime. In this work, we revisit multi-armed bandit algorithms for OAS and discuss their capability of dealing with the problem. We adapt them towards runtime-oriented losses, allowing for partially censored data while keeping a space- and time-complexity independent of the time horizon.
arXiv Detail & Related papers (2021-09-13T18:10:52Z)
Double Coverage with Machine-Learned Advice [100.23487145400833]
We study the fundamental online $k$-server problem in a learning-augmented setting. We show that our algorithm achieves for any k an almost optimal consistency-robustness tradeoff.
arXiv Detail & Related papers (2021-03-02T11:04:33Z)
Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem which estimates signals integer from linear models. The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning. We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal estimation for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z)
Corralling Stochastic Bandit Algorithms [54.10645564702416]
We show that the regret of the corralling algorithms is no worse than that of the best algorithm containing the arm with the highest reward. We show that the gap between the highest reward and other rewards depends on the gap between the highest reward and other rewards.
arXiv Detail & Related papers (2020-06-16T15:33:12Z)
TopRank+: A Refinement of TopRank Algorithm [0.0]
A novel online learning algorithm was proposed based on topological sorting. In this work, we utilize method of mixtures and expansions of certain implicit function to provide a tighter, iterated-log-like boundary for the inequalities.
arXiv Detail & Related papers (2020-01-21T15:44:44Z)

This list is automatically generated from the titles and abstracts of the papers in this site.