Fairer LP-based Online Allocation
- URL: http://arxiv.org/abs/2110.14621v1
- Date: Wed, 27 Oct 2021 17:45:20 GMT
- Title: Fairer LP-based Online Allocation
- Authors: Guanting Chen, Xiaocheng Li, Yinyu Ye
- Abstract summary: We consider a Linear Program (LP)-based online resource allocation problem where a decision maker accepts or rejects incoming customer requests irrevocably.
We propose a fair algorithm that uses an interior-point LP solver and dynamically detects unfair resource spending.
Our approach do not formulate the fairness requirement as a constrain in the optimization instance, and instead we address the problem from the perspective of algorithm design.
- Score: 13.478067250930101
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we consider a Linear Program (LP)-based online resource
allocation problem where a decision maker accepts or rejects incoming customer
requests irrevocably in order to maximize expected revenue given limited
resources. At each time, a new order/customer/bid is revealed with a request of
some resource(s) and a reward. We consider a stochastic setting where all the
orders are i.i.d. sampled from an unknown distribution. Such formulation
contains many classic applications such as the canonical (quantity-based)
network revenue management problem and the Adwords problem. Specifically, we
study the objective of providing fairness guarantees while maintaining low
regret. Our definition of fairness is that a fair online algorithm should treat
similar agents/customers similarly and the decision made for similar
individuals should be consistent over time. We define a fair offline solution
as the analytic center of the offline optimal solution set, and propose a fair
algorithm that uses an interior-point LP solver and dynamically detects unfair
resource spending. Our algorithm can control cumulative unfairness (the
cumulative deviation from the online solutions to the offline fair solution) on
the scale of order $O(\log(T))$, while maintaining the regret to be bounded
with dependency on $T$. Our approach do not formulate the fairness requirement
as a constrain in the optimization instance, and instead we address the problem
from the perspective of algorithm design. We get the desirable fairness
guarantee without imposing any fairness constraint, and our regret result is
strong for the reason that we evaluate the regret by comparing to the original
objective value.
Related papers
- A Primal-Dual Online Learning Approach for Dynamic Pricing of Sequentially Displayed Complementary Items under Sale Constraints [54.46126953873298]
We address the problem of dynamically pricing complementary items that are sequentially displayed to customers.
Coherent pricing policies for complementary items are essential because optimizing the pricing of each item individually is ineffective.
We empirically evaluate our approach using synthetic settings randomly generated from real-world data, and compare its performance in terms of constraints violation and regret.
arXiv Detail & Related papers (2024-07-08T09:55:31Z) - Trading-off price for data quality to achieve fair online allocation [25.154957931903525]
We consider the problem of online allocation subject to a long-term fairness penalty.
We propose an algorithm that jointly solves both problems and show that it has a regret bounded by $mathcalO(sqrtT)$.
arXiv Detail & Related papers (2023-06-23T11:09:43Z) - Variance-Dependent Regret Bounds for Linear Bandits and Reinforcement
Learning: Adaptivity and Computational Efficiency [90.40062452292091]
We present the first computationally efficient algorithm for linear bandits with heteroscedastic noise.
Our algorithm is adaptive to the unknown variance of noise and achieves an $tildeO(d sqrtsum_k = 1K sigma_k2 + d)$ regret.
We also propose a variance-adaptive algorithm for linear mixture Markov decision processes (MDPs) in reinforcement learning.
arXiv Detail & Related papers (2023-02-21T00:17:24Z) - Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage [100.8180383245813]
We propose value-based algorithms for offline reinforcement learning (RL)
We show an analogous result for vanilla Q-functions under a soft margin condition.
Our algorithms' loss functions arise from casting the estimation problems as nonlinear convex optimization problems and Lagrangifying.
arXiv Detail & Related papers (2023-02-05T14:22:41Z) - Online Learning under Budget and ROI Constraints via Weak Adaptivity [57.097119428915796]
Existing primal-dual algorithms for constrained online learning problems rely on two fundamental assumptions.
We show how such assumptions can be circumvented by endowing standard primal-dual templates with weakly adaptive regret minimizers.
We prove the first best-of-both-worlds no-regret guarantees which hold in absence of the two aforementioned assumptions.
arXiv Detail & Related papers (2023-02-02T16:30:33Z) - Safe Online Bid Optimization with Return-On-Investment and Budget
Constraints subject to Uncertainty [87.81197574939355]
We study the nature of both the optimization and learning problems.
We provide an algorithm, namely GCB, guaranteeing sublinear regret at the cost of a potentially linear number of constraints violations.
More interestingly, we provide an algorithm, namely GCB_safe(psi,phi), guaranteeing both sublinear pseudo-regret and safety w.h.p. at the cost of accepting tolerances psi and phi.
arXiv Detail & Related papers (2022-01-18T17:24:20Z) - Online Allocation with Two-sided Resource Constraints [44.5635910908944]
We consider an online allocation problem subject to lower and upper resource constraints, where the requests arrive sequentially.
We propose a new algorithm that obtains $1-O(fracepsilonalpha-epsilon)$ -competitive ratio for the offline problems that know the entire requests ahead of time.
arXiv Detail & Related papers (2021-12-28T02:21:06Z) - Algorithmic Challenges in Ensuring Fairness at the Time of Decision [6.228560624452748]
Algorithmic decision-making in societal contexts can be framed as optimization under bandit feedback.
Recent lawsuits accuse companies that deploy algorithmic pricing practices of price gouging.
We introduce the well-studied fairness notion of envy-freeness within the context of convex optimization.
arXiv Detail & Related papers (2021-03-16T19:06:28Z) - Online Apprenticeship Learning [58.45089581278177]
In Apprenticeship Learning (AL), we are given a Markov Decision Process (MDP) without access to the cost function.
The goal is to find a policy that matches the expert's performance on some predefined set of cost functions.
We show that the OAL problem can be effectively solved by combining two mirror descent based no-regret algorithms.
arXiv Detail & Related papers (2021-02-13T12:57:51Z) - The Best of Many Worlds: Dual Mirror Descent for Online Allocation
Problems [7.433931244705934]
We consider a data-driven setting in which the reward and resource consumption of each request are generated using an input model unknown to the decision maker.
We design general class of algorithms that attain good performance in various input models without knowing which type of input they are facing.
Our algorithms operate in the Lagrangian dual space: they maintain a dual multiplier for each resource that is updated using online mirror descent.
arXiv Detail & Related papers (2020-11-18T18:39:17Z) - Regularized Online Allocation Problems: Fairness and Beyond [7.433931244705934]
We introduce the emphregularized online allocation problem, a variant that includes a non-linear regularizer acting on the total resource consumption.
In this problem, requests repeatedly arrive over time and, for each request, a decision maker needs to take an action that generates a reward and consumes resources.
The objective is to simultaneously maximize additively separable rewards and the value of a non-separable regularizer subject to the resource constraints.
arXiv Detail & Related papers (2020-07-01T14:24:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.