Submodular Bandit Problem Under Multiple Constraints
- URL: http://arxiv.org/abs/2006.00661v5
- Date: Mon, 29 Mar 2021 02:02:19 GMT
- Title: Submodular Bandit Problem Under Multiple Constraints
- Authors: Sho Takemori, Masahiro Sato, Takashi Sonoda, Janmajay Singh, Tomoko
Ohkuma
- Abstract summary: We introduce a submodular bandit problem under the intersection of $l$ knapsacks and a $k$-system constraint.
To solve this problem, we propose a non-greedy algorithm that adaptively focuses on a standard or modified upper-confidence bound.
We provide a high-probability upper bound of an approximation regret, where the approximation ratio matches that of a fast offline algorithm.
- Score: 8.100450025624443
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The linear submodular bandit problem was proposed to simultaneously address
diversified retrieval and online learning in a recommender system. If there is
no uncertainty, this problem is equivalent to a submodular maximization problem
under a cardinality constraint. However, in some situations, recommendation
lists should satisfy additional constraints such as budget constraints, other
than a cardinality constraint. Thus, motivated by diversified retrieval
considering budget constraints, we introduce a submodular bandit problem under
the intersection of $l$ knapsacks and a $k$-system constraint. Here $k$-system
constraints form a very general class of constraints including cardinality
constraints and the intersection of $k$ matroid constraints. To solve this
problem, we propose a non-greedy algorithm that adaptively focuses on a
standard or modified upper-confidence bound. We provide a high-probability
upper bound of an approximation regret, where the approximation ratio matches
that of a fast offline algorithm. Moreover, we perform experiments under
various combinations of constraints using a synthetic and two real-world
datasets and demonstrate that our proposed methods outperform the existing
baselines.
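When there is no uncertainty, the abstract notes that the problem reduces to offline submodular maximization under constraints. The sketch below is not the paper's bandit algorithm; it is a minimal illustrative greedy for monotone submodular maximization under a cardinality constraint plus a single knapsack (budget) constraint, using a toy coverage objective. All function names and the cost-benefit greedy rule here are illustrative assumptions, not taken from the paper.

```python
# Hedged sketch: offline greedy for monotone submodular maximization
# under a cardinality constraint (at most k items) and one knapsack
# (budget) constraint. Illustrative only; not the paper's method.

def coverage(selected, sets):
    """Monotone submodular objective: number of ground elements covered."""
    covered = set()
    for i in selected:
        covered |= sets[i]
    return len(covered)

def greedy(sets, costs, k, budget):
    """Pick up to k sets whose total cost fits the budget, greedily
    maximizing marginal coverage per unit cost at each step."""
    chosen, spent = [], 0.0
    candidates = set(range(len(sets)))
    while len(chosen) < k:
        best, best_ratio = None, 0.0
        for i in sorted(candidates):  # sorted for deterministic ties
            if spent + costs[i] > budget:
                continue  # would violate the knapsack constraint
            gain = coverage(chosen + [i], sets) - coverage(chosen, sets)
            ratio = gain / costs[i]
            if ratio > best_ratio:
                best, best_ratio = i, ratio
        if best is None:
            break  # no feasible item adds value
        chosen.append(best)
        spent += costs[best]
        candidates.remove(best)
    return chosen

# Toy instance: 4 candidate sets with per-set costs.
sets = [{1, 2, 3}, {3, 4}, {5}, {1, 5, 6}]
costs = [2.0, 1.0, 1.0, 2.5]
picked = greedy(sets, costs, k=2, budget=4.0)
```

In the bandit setting described above, the true marginal gains are unknown and would be replaced by (standard or modified) upper-confidence-bound estimates.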
Related papers
- Fixed-Budget Differentially Private Best Arm Identification [62.36929749450298]
We study best arm identification (BAI) in linear bandits in the fixed-budget regime under differential privacy constraints.
We derive a minimax lower bound on the error probability, and demonstrate that the lower and the upper bounds decay exponentially in $T$.
arXiv Detail & Related papers (2024-01-17T09:23:25Z) - Multi-Agent Bayesian Optimization with Coupled Black-Box and Affine
Constraints [21.38692458445459]
We study the problem of distributed multi-agent Bayesian optimization with both coupled black-box constraints and known affine constraints.
A primal-dual distributed algorithm is proposed that achieves similar regret/violation bounds as those in the single-agent case.
arXiv Detail & Related papers (2023-10-02T08:07:36Z) - Submodular Maximization under the Intersection of Matroid and Knapsack
Constraints [23.0838604893412]
We consider the problem of submodular maximization under the intersection of two commonly used constraints, i.e., a $k$-matroid constraint and an $m$-knapsack constraint.
We prove that SPROUT can achieve a polynomial-time approximation guarantee better than state-of-the-art algorithms.
Experiments on the applications of movie recommendation and weighted max-cut demonstrate the superiority of SPROUT++ in practice.
arXiv Detail & Related papers (2023-07-18T02:37:14Z) - A Framework for Adapting Offline Algorithms to Solve Combinatorial
Multi-Armed Bandit Problems with Bandit Feedback [27.192028744078282]
We provide a framework for adapting discrete offline approximation algorithms into sublinear $\alpha$-regret methods.
The proposed framework is applied to diverse applications in submodular maximization.
arXiv Detail & Related papers (2023-01-30T23:18:06Z) - A Unifying Framework for Online Optimization with Long-Term Constraints [62.35194099438855]
We study online learning problems in which a decision maker has to take a sequence of decisions subject to $m$ long-term constraints.
The goal is to maximize their total reward, while at the same time achieving small cumulative violation across the $T$ rounds.
We present the first best-of-both-worlds type algorithm for this general class of problems, with no-regret guarantees both in the case in which rewards and constraints are selected according to an unknown model, and in the case in which they are selected at each round by an adversary.
arXiv Detail & Related papers (2022-09-15T16:59:19Z) - Algorithm for Constrained Markov Decision Process with Linear
Convergence [55.41644538483948]
An agent aims to maximize the expected accumulated discounted reward subject to multiple constraints on its costs.
A new dual approach is proposed with the integration of two ingredients: entropy regularized policy and Vaidya's dual.
The proposed approach is shown to converge (with linear rate) to the global optimum.
arXiv Detail & Related papers (2022-06-03T16:26:38Z) - Concave Utility Reinforcement Learning with Zero-Constraint Violations [43.29210413964558]
We consider the problem of concave utility reinforcement learning (CURL) with convex constraints.
We propose a model-based learning algorithm that also achieves zero constraint violations.
arXiv Detail & Related papers (2021-09-12T06:13:33Z) - On Lower Bounds for Standard and Robust Gaussian Process Bandit
Optimization [55.937424268654645]
We consider algorithm-independent lower bounds for the problem of black-box optimization of functions having a bounded norm.
We provide a novel proof technique for deriving lower bounds on the regret, with benefits including simplicity, versatility, and an improved dependence on the error probability.
arXiv Detail & Related papers (2020-08-20T03:48:14Z) - Conditional gradient methods for stochastically constrained convex
minimization [54.53786593679331]
We propose two novel conditional gradient-based methods for solving structured convex optimization problems.
The most important feature of our framework is that only a subset of the constraints is processed at each iteration.
Our algorithms rely on variance reduction and smoothing used in conjunction with conditional gradient steps, and are accompanied by rigorous convergence guarantees.
arXiv Detail & Related papers (2020-07-07T21:26:35Z) - An Integer Linear Programming Framework for Mining Constraints from Data [81.60135973848125]
We present a general framework for mining constraints from data.
In particular, we consider the inference in structured output prediction as an integer linear programming (ILP) problem.
We show that our approach can learn to solve 9x9 Sudoku puzzles and minimal spanning tree problems from examples without providing the underlying rules.
arXiv Detail & Related papers (2020-06-18T20:09:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.