Online Allocation with Two-sided Resource Constraints
- URL: http://arxiv.org/abs/2112.13964v1
- Date: Tue, 28 Dec 2021 02:21:06 GMT
- Title: Online Allocation with Two-sided Resource Constraints
- Authors: Qixin Zhang, Wenbing Ye, Zaiyi Chen, Haoyuan Hu, Enhong Chen, Yang Yu
- Abstract summary: We consider an online allocation problem subject to lower and upper resource constraints, where the requests arrive sequentially.
We propose a new algorithm that obtains $1-O(fracepsilonalpha-epsilon)$ -competitive ratio for the offline problems that know the entire requests ahead of time.
- Score: 44.5635910908944
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Motivated by many interesting real-world applications in logistics and online
advertising, we consider an online allocation problem subject to lower and
upper resource constraints, where the requests arrive sequentially, sampled
i.i.d. from an unknown distribution, and we need to promptly make a decision
given limited resources and lower bounds requirements. First, with knowledge of
the measure of feasibility, i.e., $\alpha$, we propose a new algorithm that
obtains $1-O(\frac{\epsilon}{\alpha-\epsilon})$ -competitive ratio for the
offline problems that know the entire requests ahead of time. Inspired by the
previous studies, this algorithm adopts an innovative technique to dynamically
update a threshold price vector for making decisions. Moreover, an optimization
method to estimate the optimal measure of feasibility is proposed with
theoretical guarantee at the end of this paper. Based on this method, if we
tolerate slight violation of the lower bounds constraints with parameter
$\eta$, the proposed algorithm is naturally extended to the settings without
strong feasible assumption, which cover the significantly unexplored infeasible
scenarios.
Related papers
- A Bandit Approach to Online Pricing for Heterogeneous Edge Resource
Allocation [8.089950414444115]
Two novel online pricing mechanisms are proposed for heterogeneous edge resource allocation.
The mechanisms operate in real-time and do not require prior knowledge of demand distribution.
The proposed posted pricing schemes allow users to select and pay for their preferred resources, with the platform dynamically adjusting resource prices based on observed historical data.
arXiv Detail & Related papers (2023-02-14T10:21:14Z) - Online Learning under Budget and ROI Constraints via Weak Adaptivity [57.097119428915796]
Existing primal-dual algorithms for constrained online learning problems rely on two fundamental assumptions.
We show how such assumptions can be circumvented by endowing standard primal-dual templates with weakly adaptive regret minimizers.
We prove the first best-of-both-worlds no-regret guarantees which hold in absence of the two aforementioned assumptions.
arXiv Detail & Related papers (2023-02-02T16:30:33Z) - Stochastic Direct Search Method for Blind Resource Allocation [6.574808513848414]
We study direct search (also known as pattern search) methods for linearly constrained and derivative-free optimization.
We show that direct search methods achieves finite regret in the deterministic and unconstrained case.
We propose a simple extension of direct search that achieves a regret upper-bound of the order of $T2/3$.
arXiv Detail & Related papers (2022-10-11T07:40:45Z) - Online Contextual Decision-Making with a Smart Predict-then-Optimize
Method [4.061135251278187]
We study an online contextual decision-making problem with resource constraints.
We propose an algorithm that mixes a prediction step based on the "Smart Predict-then- (SPO)" method with a dual update step based on mirror descent.
We prove regret bounds and demonstrate that the overall convergence rate of our method depends on the $mathcalO(T-1/2)$ convergence of online mirror descent.
arXiv Detail & Related papers (2022-06-15T06:16:13Z) - Online Nonsubmodular Minimization with Delayed Costs: From Full
Information to Bandit Feedback [98.7678704343537]
We focus on a class of nonsubmodular functions with special structure, and prove regret guarantees for several variants of the online and approximate online bandit gradient descent algorithms.
We derive bounds for the agent's regret in the full information and bandit feedback setting, even if the delay between choosing a decision and receiving the incurred cost is unbounded.
arXiv Detail & Related papers (2022-05-15T08:27:12Z) - Safe Online Bid Optimization with Return-On-Investment and Budget
Constraints subject to Uncertainty [87.81197574939355]
We study the nature of both the optimization and learning problems.
We provide an algorithm, namely GCB, guaranteeing sublinear regret at the cost of a potentially linear number of constraints violations.
More interestingly, we provide an algorithm, namely GCB_safe(psi,phi), guaranteeing both sublinear pseudo-regret and safety w.h.p. at the cost of accepting tolerances psi and phi.
arXiv Detail & Related papers (2022-01-18T17:24:20Z) - An Asymptotically Optimal Primal-Dual Incremental Algorithm for
Contextual Linear Bandits [129.1029690825929]
We introduce a novel algorithm improving over the state-of-the-art along multiple dimensions.
We establish minimax optimality for any learning horizon in the special case of non-contextual linear bandits.
arXiv Detail & Related papers (2020-10-23T09:12:47Z) - Coordinated Online Learning for Multi-Agent Systems with Coupled
Constraints and Perturbed Utility Observations [91.02019381927236]
We introduce a novel method to steer the agents toward a stable population state, fulfilling the given resource constraints.
The proposed method is a decentralized resource pricing method based on the resource loads resulting from the augmentation of the game's Lagrangian.
arXiv Detail & Related papers (2020-10-21T10:11:17Z) - Regularized Online Allocation Problems: Fairness and Beyond [7.433931244705934]
We introduce the emphregularized online allocation problem, a variant that includes a non-linear regularizer acting on the total resource consumption.
In this problem, requests repeatedly arrive over time and, for each request, a decision maker needs to take an action that generates a reward and consumes resources.
The objective is to simultaneously maximize additively separable rewards and the value of a non-separable regularizer subject to the resource constraints.
arXiv Detail & Related papers (2020-07-01T14:24:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.