Stochastic $k$-Submodular Bandits with Full Bandit Feedback
- URL: http://arxiv.org/abs/2412.10682v1
- Date: Sat, 14 Dec 2024 05:02:53 GMT
- Title: Stochastic $k$-Submodular Bandits with Full Bandit Feedback
- Authors: Guanyu Nie, Vaneet Aggarwal, Christopher John Quinn,
- Abstract summary: We present the first sublinear $alpha$-regret bounds for online $k$-submodular optimization problems with full-bandit feedback.
A key contribution of our work is analyzing the robustness of the algorithms.
- Score: 29.705337940879705
- License:
- Abstract: In this paper, we present the first sublinear $\alpha$-regret bounds for online $k$-submodular optimization problems with full-bandit feedback, where $\alpha$ is a corresponding offline approximation ratio. Specifically, we propose online algorithms for multiple $k$-submodular stochastic combinatorial multi-armed bandit problems, including (i) monotone functions and individual size constraints, (ii) monotone functions with matroid constraints, (iii) non-monotone functions with matroid constraints, (iv) non-monotone functions without constraints, and (v) monotone functions without constraints. We transform approximation algorithms for offline $k$-submodular maximization problems into online algorithms through the offline-to-online framework proposed by Nie et al. (2023a). A key contribution of our work is analyzing the robustness of the offline algorithms.
Related papers
- Near-Optimal Online Learning for Multi-Agent Submodular Coordination: Tight Approximation and Communication Efficiency [52.60557300927007]
We present a $textbfMA-OSMA$ algorithm to transfer the discrete submodular problem into a continuous optimization.
We also introduce a projection-free $textbfMA-OSEA$ algorithm, which effectively utilizes the KL divergence by mixing a uniform distribution.
Our algorithms significantly improve the $(frac11+c)$-approximation provided by the state-of-the-art OSG algorithm.
arXiv Detail & Related papers (2025-02-07T15:57:56Z) - Federated Combinatorial Multi-Agent Multi-Armed Bandits [79.1700188160944]
This paper introduces a federated learning framework tailored for online optimization with bandit.
In this setting, agents subsets of arms, observe noisy rewards for these subsets without accessing individual arm information, and can cooperate and share information at specific intervals.
arXiv Detail & Related papers (2024-05-09T17:40:09Z) - Bridging the Gap Between General and Down-Closed Convex Sets in
Submodular Maximization [8.225819874406238]
We show that Mualem citemualem23re shows that this approach cannot smooth between down- and non-down-closed constraints.
In this work, we suggest novel offline and online algorithms based on a natural decomposition of the body into two distinct convex bodies.
We also empirically demonstrate the superiority of our proposed algorithms across three offline and two online applications.
arXiv Detail & Related papers (2024-01-17T14:56:42Z) - Combinatorial Stochastic-Greedy Bandit [79.1700188160944]
We propose a novelgreedy bandit (SGB) algorithm for multi-armed bandit problems when no extra information other than the joint reward of the selected set of $n$ arms at each time $tin [T]$ is observed.
SGB adopts an optimized-explore-then-commit approach and is specifically designed for scenarios with a large set of base arms.
arXiv Detail & Related papers (2023-12-13T11:08:25Z) - Dynamic Non-monotone Submodular Maximization [11.354502646593607]
We show a reduction from maximizing a non-monotone submodular function under the cardinality constraint $k$ to maximizing a monotone submodular function under the same constraint.
Our algorithms maintain an $(epsilon)$-approximate of the solution and use expected amortized $O(epsilon-3k3log3(n)log(k)$ queries per update.
arXiv Detail & Related papers (2023-11-07T03:20:02Z) - Randomized Greedy Learning for Non-monotone Stochastic Submodular
Maximization Under Full-bandit Feedback [98.29086113546045]
We investigate the problem of unconstrained multi-armed bandits with full-bandit feedback and rewards for submodularity.
We show that RGL empirically outperforms other full-bandit variants in submodular and non-submodular settings.
arXiv Detail & Related papers (2023-02-02T18:52:14Z) - A Framework for Adapting Offline Algorithms to Solve Combinatorial
Multi-Armed Bandit Problems with Bandit Feedback [27.192028744078282]
We provide a framework for adapting discrete offline approximation algorithms into sublinear $alpha$-regret methods.
The proposed framework is applied to diverse applications in submodular horizon.
arXiv Detail & Related papers (2023-01-30T23:18:06Z) - Online Nonsubmodular Minimization with Delayed Costs: From Full
Information to Bandit Feedback [98.7678704343537]
We focus on a class of nonsubmodular functions with special structure, and prove regret guarantees for several variants of the online and approximate online bandit gradient descent algorithms.
We derive bounds for the agent's regret in the full information and bandit feedback setting, even if the delay between choosing a decision and receiving the incurred cost is unbounded.
arXiv Detail & Related papers (2022-05-15T08:27:12Z) - Submodular + Concave [53.208470310734825]
It has been well established that first order optimization methods can converge to the maximal objective value of concave functions.
In this work, we initiate the determinant of the smooth functions convex body $$F(x) = G(x) +C(x)$.
This class of functions is an extension of both concave and continuous DR-submodular functions for which no guarantee is known.
arXiv Detail & Related papers (2021-06-09T01:59:55Z) - Fast and Private Submodular and $k$-Submodular Functions Maximization
with Matroid Constraints [27.070004659301162]
We study the problem of maximizing monotone submodular functions subject to matroid constraints in the framework of differential privacy.
We give the first $frac12$-approximation algorithm that preserves $k$submodular functions subject to matroid constraints.
arXiv Detail & Related papers (2020-06-28T23:18:58Z) - Submodular Bandit Problem Under Multiple Constraints [8.100450025624443]
We introduce a submodular bandit problem under the intersection of $l$ knapsacks and a $k$-system constraint.
To solve this problem, we propose a non-greedy algorithm that adaptively focuses on a standard or modified upper-confidence bound.
We provide a high-probability upper bound of an approximation regret, where the approximation ratio matches that of a fast algorithm.
arXiv Detail & Related papers (2020-06-01T01:28:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.