Randomized Greedy Learning for Non-monotone Stochastic Submodular
Maximization Under Full-bandit Feedback
- URL: http://arxiv.org/abs/2302.01324v1
- Date: Thu, 2 Feb 2023 18:52:14 GMT
- Title: Randomized Greedy Learning for Non-monotone Stochastic Submodular
Maximization Under Full-bandit Feedback
- Authors: Fares Fourati, Vaneet Aggarwal, Christopher John Quinn, Mohamed-Slim
Alouini
- Abstract summary: We investigate the problem of unconstrained combinatorial multi-armed bandits with full-bandit feedback and stochastic rewards for submodular maximization.
We show that RGL empirically outperforms other full-bandit variants in submodular and non-submodular settings.
- Score: 98.29086113546045
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We investigate the problem of unconstrained combinatorial multi-armed bandits
with full-bandit feedback and stochastic rewards for submodular maximization.
Previous works investigate the same problem assuming a submodular and monotone
reward function. In this work, we study a more general problem, i.e., when the
reward function is not necessarily monotone, and the submodularity is assumed
only in expectation. We propose Randomized Greedy Learning (RGL) algorithm and
theoretically prove that it achieves a $\frac{1}{2}$-regret upper bound of
$\tilde{\mathcal{O}}(n T^{\frac{2}{3}})$ for horizon $T$ and number of arms
$n$. We also show in experiments that RGL empirically outperforms other
full-bandit variants in submodular and non-submodular settings.
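The randomized-greedy idea underlying RGL can be illustrated as a double-greedy pass over the ground set, where the marginal gains that drive each random add/remove decision are estimated from repeated full-bandit plays rather than computed exactly. The sketch below is illustrative only, not the paper's exact RGL procedure: the function names and the fixed per-query sample count `m` are assumptions, and RGL's actual sampling schedule is what yields the $\tilde{\mathcal{O}}(n T^{\frac{2}{3}})$ bound.

```python
import random

def estimate(noisy_f, S, m):
    """Average m noisy full-bandit evaluations of the set S."""
    return sum(noisy_f(S) for _ in range(m)) / m

def randomized_greedy_bandit(noisy_f, ground_set, m=50, rng=random.Random(0)):
    """Double-greedy pass with sampled reward estimates (illustrative sketch).

    Maintains a growing set X and a shrinking set Y; each element is either
    added to X or removed from Y, with probability proportional to the
    positive parts of the two estimated marginal gains.
    """
    X, Y = set(), set(ground_set)
    for i in ground_set:
        a = estimate(noisy_f, X | {i}, m) - estimate(noisy_f, X, m)
        b = estimate(noisy_f, Y - {i}, m) - estimate(noisy_f, Y, m)
        a_plus, b_plus = max(a, 0.0), max(b, 0.0)
        # If both marginals are non-positive, default to removing i from Y.
        p = a_plus / (a_plus + b_plus) if (a_plus + b_plus) > 0 else 0.0
        if rng.random() < p:
            X.add(i)
        else:
            Y.discard(i)
    return X  # X == Y after the final element is processed
```

On a noise-free non-monotone submodular function such as $f(S) = |S|(n - |S|)$, a single sample per query (`m=1`) already recovers a maximizer.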
Related papers
- Fair Submodular Cover [18.37610521373708]
We present the study of Fair Submodular Cover (FSC), where we are given a ground set $U$, a monotone submodular function $f: 2^U \to \mathbb{R}_{\ge 0}$, and a threshold $\tau$.
We first introduce discrete algorithms for FSC that achieve a bicriteria approximation ratio of $(\frac{1}{\epsilon}, 1 - O(\epsilon))$.
We then present a continuous algorithm that achieves a matching $(\frac{1}{\epsilon}, 1 - O(\epsilon))$-bicriteria approximation ratio.
arXiv Detail & Related papers (2024-07-05T18:37:09Z)
- Combinatorial Stochastic-Greedy Bandit [79.1700188160944]
We propose a novel stochastic-greedy bandit (SGB) algorithm for combinatorial multi-armed bandit problems when no extra information other than the joint reward of the selected set of $n$ arms at each time $t \in [T]$ is observed.
SGB adopts an optimized-explore-then-commit approach and is specifically designed for scenarios with a large set of base arms.
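The explore-then-commit pattern that SGB builds on can be shown in its plainest single-arm form: pull every arm a fixed number of times, then commit to the empirical best for the rest of the horizon. This is a generic sketch under assumed names, not SGB's optimized combinatorial variant, which must handle sets of base arms and tune the exploration budget.

```python
def explore_then_commit(pull, n_arms, m, horizon):
    """Generic explore-then-commit sketch (not SGB's exact procedure).

    Phase 1: pull each arm m times and record empirical means.
    Phase 2: commit to the empirically best arm for the remaining rounds.
    Returns the committed arm and the total reward collected.
    """
    means = []
    total = 0.0
    for a in range(n_arms):
        rewards = [pull(a) for _ in range(m)]
        total += sum(rewards)
        means.append(sum(rewards) / m)
    best = max(range(n_arms), key=lambda a: means[a])
    for _ in range(horizon - n_arms * m):
        total += pull(best)
    return best, total
```

Choosing `m` trades off exploration cost against the risk of committing to a suboptimal arm; SGB's "optimized" aspect lies in tuning this budget for large sets of base arms.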
arXiv Detail & Related papers (2023-12-13T11:08:25Z)
- Dynamic Non-monotone Submodular Maximization [11.354502646593607]
We show a reduction from maximizing a non-monotone submodular function under the cardinality constraint $k$ to maximizing a monotone submodular function under the same constraint.
Our algorithms maintain an $\epsilon$-approximate solution and use expected amortized $O(\epsilon^{-3} k^3 \log^3(n) \log(k))$ queries per update.
arXiv Detail & Related papers (2023-11-07T03:20:02Z)
- Bandit Multi-linear DR-Submodular Maximization and Its Applications on Adversarial Submodular Bandits [21.54858035450694]
We give a sublinear regret algorithm for the submodular bandit with partition matroid constraint.
For the bandit sequential submodular, the existing work proves an $O(T^{2/3})$ regret with a suboptimal $1/2$ approximation ratio.
arXiv Detail & Related papers (2023-05-21T08:51:55Z)
- Top $K$ Ranking for Multi-Armed Bandit with Noisy Evaluations [102.32996053572144]
We consider a multi-armed bandit setting where, at the beginning of each round, the learner receives noisy independent evaluations of the true reward of each arm.
We derive different algorithmic approaches and theoretical guarantees depending on how the evaluations are generated.
arXiv Detail & Related papers (2021-12-13T09:48:54Z)
- On Submodular Contextual Bandits [92.45432756301231]
We consider the problem of contextual bandits where actions are subsets of a ground set and mean rewards are modeled by an unknown monotone submodular function.
We show that our algorithm efficiently randomizes around local optima of estimated functions according to the Inverse Gap Weighting strategy.
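Inverse Gap Weighting itself is a simple recipe: play each action with probability inversely proportional to its estimated reward gap to the best action. A minimal sketch of the generic rule is below; the exploration parameter `gamma` and function name are assumptions, and the paper's contextual procedure applies this on top of estimated submodular functions rather than raw reward estimates.

```python
import numpy as np

def inverse_gap_weighting(estimates, gamma):
    """Action-sampling distribution via Inverse Gap Weighting (IGW).

    Each suboptimal action a gets probability 1 / (K + gamma * gap_a),
    where gap_a is its estimated gap to the best action and K is the
    number of actions; the leftover mass goes to the empirical best.
    Larger gamma concentrates play on the current best (less exploration).
    """
    estimates = np.asarray(estimates, dtype=float)
    K = len(estimates)
    best = int(np.argmax(estimates))
    gaps = estimates[best] - estimates
    probs = 1.0 / (K + gamma * gaps)
    probs[best] = 0.0                 # exclude best from the gap-based mass
    probs[best] = 1.0 - probs.sum()   # assign it the remaining probability
    return probs
```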
arXiv Detail & Related papers (2021-12-03T21:42:33Z)
- Improved Regret Bounds for Online Submodular Maximization [10.089520556398575]
We consider an online optimization problem where at each step $t \in [T]$, the algorithm chooses an action $x_t$ from the fixed convex and compact domain set $\mathcal{K}$.
A utility function $f_t(\cdot)$ is then revealed and the algorithm receives the payoff $f_t(x_t)$.
arXiv Detail & Related papers (2021-06-15T02:05:35Z)
- The Power of Subsampling in Submodular Maximization [51.629656762796564]
We show that this approach leads to optimal/state-of-the-art results despite being much simpler than existing methods.
We empirically demonstrate the effectiveness of our algorithms on video summarization, location summarization, and movie recommendation tasks.
arXiv Detail & Related papers (2021-04-06T20:25:57Z)
- Regularized Submodular Maximization at Scale [45.914693923126826]
Submodularity is inherently related to the notions of diversity, coverage, and representativeness.
We propose methods for maximizing a regularized submodular function $f = g - \ell$ expressed as the difference between a submodular function $g$ and a modular function $\ell$.
arXiv Detail & Related papers (2020-02-10T02:37:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.