Related papers: Online Learning under Budget and ROI Constraints via Weak Adaptivity

Online Learning under Budget and ROI Constraints via Weak Adaptivity

URL: http://arxiv.org/abs/2302.01203v3
Date: Sat, 2 Mar 2024 17:26:22 GMT
Title: Online Learning under Budget and ROI Constraints via Weak Adaptivity
Authors: Matteo Castiglioni, Andrea Celli, Christian Kroer
Abstract summary: Existing primal-dual algorithms for constrained online learning problems rely on two fundamental assumptions. We show how such assumptions can be circumvented by endowing standard primal-dual templates with weakly adaptive regret minimizers. We prove the first best-of-both-worlds no-regret guarantees which hold in absence of the two aforementioned assumptions.
Score: 57.097119428915796
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We study online learning problems in which a decision maker has to make a sequence of costly decisions, with the goal of maximizing their expected reward while adhering to budget and return-on-investment (ROI) constraints. Existing primal-dual algorithms designed for constrained online learning problems under adversarial inputs rely on two fundamental assumptions. First, the decision maker must know beforehand the value of parameters related to the degree of strict feasibility of the problem (i.e. Slater parameters). Second, a strictly feasible solution to the offline optimization problem must exist at each round. Both requirements are unrealistic for practical applications such as bidding in online ad auctions. In this paper, we show how such assumptions can be circumvented by endowing standard primal-dual templates with weakly adaptive regret minimizers. This results in a ``dual-balancing'' framework which ensures that dual variables stay sufficiently small, even in the absence of knowledge about Slater's parameter. We prove the first best-of-both-worlds no-regret guarantees which hold in absence of the two aforementioned assumptions, under stochastic and adversarial inputs. Finally, we show how to instantiate the framework to optimally bid in various mechanisms of practical relevance, such as first- and second-price auctions.

Related papers

Why Ask One When You Can Ask $k$? Two-Stage Learning-to-Defer to the Top-$k$ Experts [3.6787328174619254]
Learning-to-Defer (L2D) enables decision-making systems to improve reliability by selectively deferring uncertain predictions to more competent agents. We propose Top-$k$ Learning-to-Defer, a generalization of the classical two-stage L2D framework that allocates each query to the $k$ most confident agents instead of a single one. To further enhance flexibility and cost-efficiency, we introduce Top-$k(x)$ Learning-to-Defer, an adaptive extension that learns the optimal number of agents to consult for each query.
arXiv Detail & Related papers (2025-04-17T14:50:40Z)
Learning While Repositioning in On-Demand Vehicle Sharing Networks [4.724825031148413]
We consider a network inventory problem motivated by one-way, on-demand vehicle sharing services. We show that a natural Lipschitz-bandit approach achieves a regret guarantee of $widetildeO(Tfracnn+1)$, which suffers from the exponential dependence on $n$. Motivated by these challenges, we propose an Online Gradient Repositioning algorithm that relies solely on censored demand.
arXiv Detail & Related papers (2025-01-31T15:16:02Z)
Deep Generative Demand Learning for Newsvendor and Pricing [7.594251468240168]
We consider data-driven inventory and pricing decisions in the feature-based newsvendor problem. We propose a novel approach leveraging conditional deep generative models (cDGMs) to address these challenges. We provide theoretical guarantees for our approach, including the consistency of profit estimation and convergence of our decisions to the optimal solution.
arXiv Detail & Related papers (2024-11-13T14:17:26Z)
Selling Joint Ads: A Regret Minimization Perspective [7.288063443108292]
Motivated by online retail, we consider the problem of selling one item (e.g., an ad slot) to two non-excludable buyers (say, a merchant and a brand) This problem captures, for example, situations where a merchant and a brand bid cooperatively in an auction to advertise a product, and both benefit from the ad being shown. A mechanism collects bids from the two and decides whether to allocate and which payments the two parties should make.
arXiv Detail & Related papers (2024-09-12T07:59:10Z)
A Primal-Dual Online Learning Approach for Dynamic Pricing of Sequentially Displayed Complementary Items under Sale Constraints [54.46126953873298]
We address the problem of dynamically pricing complementary items that are sequentially displayed to customers. Coherent pricing policies for complementary items are essential because optimizing the pricing of each item individually is ineffective. We empirically evaluate our approach using synthetic settings randomly generated from real-world data, and compare its performance in terms of constraints violation and regret.
arXiv Detail & Related papers (2024-07-08T09:55:31Z)
Near-Optimal Solutions of Constrained Learning Problems [85.48853063302764]
In machine learning systems, the need to curtail their behavior has become increasingly apparent. This is evidenced by recent advancements towards developing models that satisfy dual robustness variables. Our results show that rich parametrizations effectively mitigate non-dimensional, finite learning problems.
arXiv Detail & Related papers (2024-03-18T14:55:45Z)
Efficient Methods for Non-stationary Online Learning [67.3300478545554]
We present efficient methods for optimizing dynamic regret and adaptive regret, which reduce the number of projections per round from $mathcalO(log T)$ to $1$. Our technique hinges on the reduction mechanism developed in parameter-free online learning and requires non-trivial twists on non-stationary online methods.
arXiv Detail & Related papers (2023-09-16T07:30:12Z)
A Unifying Framework for Online Optimization with Long-Term Constraints [62.35194099438855]
We study online learning problems in which a decision maker has to take a sequence of decisions subject to $m$ long-term constraints. The goal is to maximize their total reward, while at the same time achieving small cumulative violation across the $T$ rounds. We present the first best-of-both-world type algorithm for this general class problems, with no-regret guarantees both in the case in which rewards and constraints are selected according to an unknown model, and in the case in which they are selected at each round by an adversary.
arXiv Detail & Related papers (2022-09-15T16:59:19Z)
Online Learning with Knapsacks: the Best of Both Worlds [54.28273783164608]
We casting online learning problems in which a decision maker wants to maximize their expected reward without violating a finite set of $m$m resource constraints. Our framework allows the decision maker to handle its evidence flexibility and costoretic functions.
arXiv Detail & Related papers (2022-02-28T12:10:48Z)
Online Allocation with Two-sided Resource Constraints [44.5635910908944]
We consider an online allocation problem subject to lower and upper resource constraints, where the requests arrive sequentially. We propose a new algorithm that obtains $1-O(fracepsilonalpha-epsilon)$ -competitive ratio for the offline problems that know the entire requests ahead of time.
arXiv Detail & Related papers (2021-12-28T02:21:06Z)
Online Optimization and Ambiguity-based Learning of Distributionally Uncertain Dynamic Systems [1.6709415233613623]
This paper proposes a novel approach to construct data-driven online solutions to optimization problems (P) subject to a class of distributionally uncertain dynamical systems. The introduced framework allows for the simultaneous learning of distributional system uncertainty via a parameterized, control-dependent ambiguity set. We also introduce an online version of Nesterov's accelerated-gradient algorithm, and analyze its performance to solve this class of problems via dissipativity theory.
arXiv Detail & Related papers (2021-02-18T01:49:06Z)
An Online Method for A Class of Distributionally Robust Optimization with Non-Convex Objectives [54.29001037565384]
We propose a practical online method for solving a class of online distributionally robust optimization (DRO) problems. Our studies demonstrate important applications in machine learning for improving the robustness of networks.
arXiv Detail & Related papers (2020-06-17T20:19:25Z)

This list is automatically generated from the titles and abstracts of the papers in this site.