Submodular Bandit Problem Under Multiple Constraints
- URL: http://arxiv.org/abs/2006.00661v5
- Date: Mon, 29 Mar 2021 02:02:19 GMT
- Title: Submodular Bandit Problem Under Multiple Constraints
- Authors: Sho Takemori, Masahiro Sato, Takashi Sonoda, Janmajay Singh, Tomoko
Ohkuma
- Abstract summary: We introduce a submodular bandit problem under the intersection of $l$ knapsacks and a $k$-system constraint.
To solve this problem, we propose a non-greedy algorithm that adaptively focuses on a standard or modified upper-confidence bound.
We provide a high-probability upper bound of an approximation regret, where the approximation ratio matches that of a fast offline algorithm.
- Score: 8.100450025624443
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The linear submodular bandit problem was proposed to simultaneously address
diversified retrieval and online learning in a recommender system. If there is
no uncertainty, this problem is equivalent to a submodular maximization problem
under a cardinality constraint. However, in some situations, recommendation
lists should satisfy additional constraints such as budget constraints, other
than a cardinality constraint. Thus, motivated by diversified retrieval
considering budget constraints, we introduce a submodular bandit problem under
the intersection of $l$ knapsacks and a $k$-system constraint. Here $k$-system
constraints form a very general class of constraints including cardinality
constraints and the intersection of $k$ matroid constraints. To solve this
problem, we propose a non-greedy algorithm that adaptively focuses on a
standard or modified upper-confidence bound. We provide a high-probability
upper bound of an approximation regret, where the approximation ratio matches
that of a fast offline algorithm. Moreover, we perform experiments under
various combinations of constraints using a synthetic and two real-world
datasets and demonstrate that our proposed methods outperform the existing
baselines.
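When there is no uncertainty, the abstract notes that the problem reduces to offline submodular maximization under constraints. The sketch below is not the paper's bandit algorithm; it is a minimal illustrative greedy for monotone submodular maximization under a cardinality constraint plus a single knapsack (budget) constraint, using a toy coverage objective. All function names and the cost-benefit greedy rule here are illustrative assumptions, not taken from the paper.

```python
# Hedged sketch: offline greedy for monotone submodular maximization
# under a cardinality constraint (at most k items) and one knapsack
# (budget) constraint. Illustrative only; not the paper's method.

def coverage(selected, sets):
    """Monotone submodular objective: number of ground elements covered."""
    covered = set()
    for i in selected:
        covered |= sets[i]
    return len(covered)

def greedy(sets, costs, k, budget):
    """Pick up to k sets whose total cost fits the budget, greedily
    maximizing marginal coverage per unit cost at each step."""
    chosen, spent = [], 0.0
    candidates = set(range(len(sets)))
    while len(chosen) < k:
        best, best_ratio = None, 0.0
        for i in sorted(candidates):  # sorted for deterministic ties
            if spent + costs[i] > budget:
                continue  # would violate the knapsack constraint
            gain = coverage(chosen + [i], sets) - coverage(chosen, sets)
            ratio = gain / costs[i]
            if ratio > best_ratio:
                best, best_ratio = i, ratio
        if best is None:
            break  # no feasible item adds value
        chosen.append(best)
        spent += costs[best]
        candidates.remove(best)
    return chosen

# Toy instance: 4 candidate sets with per-set costs.
sets = [{1, 2, 3}, {3, 4}, {5}, {1, 5, 6}]
costs = [2.0, 1.0, 1.0, 2.5]
picked = greedy(sets, costs, k=2, budget=4.0)
```

In the bandit setting described above, the true marginal gains are unknown and would be replaced by (standard or modified) upper-confidence-bound estimates.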
Related papers
- Fixed-Budget Differentially Private Best Arm Identification [62.36929749450298]
We study best arm identification (BAI) in linear bandits in the fixed-budget regime under differential privacy constraints.
We derive a minimax lower bound on the error probability, and demonstrate that the lower and the upper bounds decay exponentially in $T$.
arXiv Detail & Related papers (2024-01-17T09:23:25Z) - Multi-Agent Bayesian Optimization with Coupled Black-Box and Affine
Constraints [21.38692458445459]
We study the problem of distributed multi-agent Bayesian optimization with both coupled black-box constraints and known affine constraints.
A primal-dual distributed algorithm is proposed that achieves similar regret/violation bounds as those in the single-agent case.
arXiv Detail & Related papers (2023-10-02T08:07:36Z) - Submodular Maximization under the Intersection of Matroid and Knapsack
Constraints [23.0838604893412]
We consider the problem of submodular maximization under the intersection of two commonly used constraints, i.e., a $k$-matroid constraint and an $m$-knapsack constraint.
We prove that SPROUT can achieve a polynomial-time approximation guarantee better than state-of-the-art algorithms.
Experiments on the applications of movie recommendation and weighted max-cut demonstrate the superiority of SPROUT++ in practice.
arXiv Detail & Related papers (2023-07-18T02:37:14Z) - A Framework for Adapting Offline Algorithms to Solve Combinatorial
Multi-Armed Bandit Problems with Bandit Feedback [27.192028744078282]
We provide a framework for adapting discrete offline approximation algorithms into sublinear $\alpha$-regret methods.
The proposed framework is applied to diverse applications in submodular maximization.
arXiv Detail & Related papers (2023-01-30T23:18:06Z) - A Unifying Framework for Online Optimization with Long-Term Constraints [62.35194099438855]
We study online learning problems in which a decision maker has to take a sequence of decisions subject to $m$ long-term constraints.
The goal is to maximize their total reward, while at the same time achieving small cumulative violation across the $T$ rounds.
We present the first best-of-both-worlds type algorithm for this general class of problems, with no-regret guarantees both in the case in which rewards and constraints are selected according to an unknown model, and in the case in which they are selected at each round by an adversary.
arXiv Detail & Related papers (2022-09-15T16:59:19Z) - Algorithm for Constrained Markov Decision Process with Linear
Convergence [55.41644538483948]
An agent aims to maximize the expected accumulated discounted reward subject to multiple constraints on its costs.
A new dual approach is proposed with the integration of two ingredients: entropy regularized policy and Vaidya's dual.
The proposed approach is shown to converge (with linear rate) to the global optimum.
arXiv Detail & Related papers (2022-06-03T16:26:38Z) - Concave Utility Reinforcement Learning with Zero-Constraint Violations [43.29210413964558]
We consider the problem of concave utility reinforcement learning (CURL) with convex constraints.
We propose a model-based learning algorithm that also achieves zero constraint violations.
arXiv Detail & Related papers (2021-09-12T06:13:33Z) - On Lower Bounds for Standard and Robust Gaussian Process Bandit
Optimization [55.937424268654645]
We consider algorithm-independent lower bounds for the problem of black-box optimization of functions having a bounded norm.
We provide a novel proof technique for deriving lower bounds on the regret, with benefits including simplicity, versatility, and an improved dependence on the error probability.
arXiv Detail & Related papers (2020-08-20T03:48:14Z) - Conditional gradient methods for stochastically constrained convex
minimization [54.53786593679331]
We propose two novel conditional gradient-based methods for solving structured convex optimization problems.
The most important feature of our framework is that only a subset of the constraints is processed at each iteration.
Our algorithms rely on variance reduction and smoothing used in conjunction with conditional gradient steps, and are accompanied by rigorous convergence guarantees.
arXiv Detail & Related papers (2020-07-07T21:26:35Z) - An Integer Linear Programming Framework for Mining Constraints from Data [81.60135973848125]
We present a general framework for mining constraints from data.
In particular, we consider the inference in structured output prediction as an integer linear programming (ILP) problem.
We show that our approach can learn to solve 9x9 Sudoku puzzles and minimal spanning tree problems from examples without providing the underlying rules.
arXiv Detail & Related papers (2020-06-18T20:09:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.