Group-Fair Online Allocation in Continuous Time
- URL: http://arxiv.org/abs/2006.06852v2
- Date: Thu, 23 Jul 2020 19:07:54 GMT
- Title: Group-Fair Online Allocation in Continuous Time
- Authors: Semih Cayci, Swati Gupta, Atilla Eryilmaz
- Abstract summary: We consider a continuous-time online learning problem with fairness considerations.
We show that this formulation recovers reward-maximizing, max-min fair and proportionally fair allocation rules.
We propose a novel online learning algorithm based on dual ascent optimization for time averages.
- Score: 27.32936573198251
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The theory of discrete-time online learning has been successfully applied in
many problems that involve sequential decision-making under uncertainty.
However, in many applications including contractual hiring in online
freelancing platforms and server allocation in cloud computing systems, the
outcome of each action is observed only after a random and action-dependent
time. Furthermore, as a consequence of certain ethical and economic concerns,
the controller may impose deadlines on the completion of each task, and require
fairness across different groups in the allocation of total time budget $B$. In
order to address these applications, we consider a continuous-time online
learning problem with fairness considerations, and present a novel framework
based on continuous-time utility maximization. We show that this formulation
recovers reward-maximizing, max-min fair and proportionally fair allocation
rules across different groups as special cases. We characterize the optimal
offline policy, which allocates the total time between different actions in an
optimally fair way (as defined by the utility function) and imposes deadlines
to maximize time-efficiency. In the absence of any statistical knowledge, we
propose a novel online learning algorithm based on dual ascent optimization for
time averages, and prove that it achieves an $\tilde{O}(B^{-1/2})$ regret bound.
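To make the dual-ascent idea concrete, here is a minimal Python sketch of an allocation loop in this spirit. It is our own illustration, not the authors' algorithm: the log utility (proportional fairness), the step size, and all function names are assumptions.

```python
import numpy as np

# Minimal sketch of a dual-ascent allocation loop in the spirit of the
# abstract (our illustration, not the authors' algorithm). The controller
# spends a time budget B across K groups; each pull returns a reward only
# after a random, action-dependent duration. Proportional fairness is
# modeled with a log utility over the groups' time shares.

def utility_grad(x, eps=1e-6):
    # U(x) = log(x) gives proportional fairness; U'(x) = 1/x.
    return 1.0 / (x + eps)

def dual_ascent_allocate(B, K, sample_outcome, step=0.05):
    """sample_outcome(k) -> (reward, duration) for one task of group k."""
    spent = 0.0
    rate = np.full(K, 1e-3)        # running reward-per-time estimates
    share = np.full(K, 1e-3)       # fraction of the budget used per group
    lam = np.ones(K)               # dual variables on the time averages
    while spent < B:
        # Primal step: pick the group with the best dual-adjusted
        # marginal utility.
        k = int(np.argmax(utility_grad(share) * rate - lam))
        reward, duration = sample_outcome(k)
        spent += duration
        rate[k] += step * (reward / max(duration, 1e-9) - rate[k])
        share[k] += duration / B
        # Dual step: ascend on the deviation from an equal time split.
        lam = np.maximum(0.0, lam + step * (share - 1.0 / K))
    return share

# Toy usage: two groups with different reward and duration distributions.
rng = np.random.default_rng(1)
outcome = lambda k: (rng.exponential(1.0 + k), rng.exponential(1.0))
print(dual_ascent_allocate(B=500.0, K=2, sample_outcome=outcome))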
Related papers
- Adversarial Network Optimization under Bandit Feedback: Maximizing Utility in Non-Stationary Multi-Hop Networks [35.78834550608041]
Classical stochastic network optimization (SNO) algorithms require network conditions to be stationary over time.
Motivated by these issues, we consider Adversarial Network Optimization (ANO) under bandit feedback.
Our proposed UMO2 algorithm ensures network stability and also matches the utility performance of any "mildly varying" reference policy.
arXiv Detail & Related papers (2024-08-29T02:18:28Z) - Learning to Cover: Online Learning and Optimization with Irreversible Decisions [50.5775508521174]
We find that regret grows sub-linearly at a rate $\Theta\left(m^{\frac{1}{2}} \cdot \frac{1}{1-2^{-T}}\right)$, thus converging exponentially fast to $\Theta(\sqrt{m})$.
These findings underscore the benefits of limited online learning and optimization, in that even a few rounds can provide significant benefits as compared to a no-learning baseline.
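As a quick sanity check of the reconstructed rate, the snippet below (our own illustration, with an arbitrary $m$) evaluates $\sqrt{m} \cdot \frac{1}{1-2^{-T}}$ for a few values of $T$ and shows the exponentially fast approach to $\sqrt{m}$.

```python
import math

# Our numeric illustration of the rate sqrt(m) / (1 - 2^(-T)): the
# T-dependent factor decays to 1 exponentially fast, so a handful of
# learning rounds already brings the bound close to sqrt(m).
m = 10_000
for T in (1, 2, 4, 8, 16):
    rate = math.sqrt(m) / (1 - 2 ** (-T))
    print(f"T={T:2d}  rate={rate:8.2f}  (sqrt(m) = {math.sqrt(m):.1f})")
```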
arXiv Detail & Related papers (2024-06-20T23:00:25Z) - Switchable Decision: Dynamic Neural Generation Networks [98.61113699324429]
We propose a switchable decision to accelerate inference by dynamically assigning resources for each data instance.
Our method reduces inference cost while maintaining the same accuracy.
arXiv Detail & Related papers (2024-05-07T17:44:54Z) - Learning to Schedule Online Tasks with Bandit Feedback [7.671139712158846]
Online task scheduling serves an integral role for task-intensive applications in cloud computing and crowdsourcing.
We propose a double-optimistic learning-based Robbins-Monro (DOL-RM) algorithm.
DOL-RM integrates a learning module, which forms optimistic estimates of the reward-to-cost ratio, with a decision module.
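A minimal sketch of how such a double-optimistic learning module might look, assuming UCB-style confidence widths; the function names and the specific width are our assumptions, not the DOL-RM specification.

```python
import math

# Hypothetical sketch of a double-optimistic learning module (our reading
# of the abstract, not the DOL-RM specification): inflate the empirical
# reward and deflate the empirical cost by confidence widths, so the
# reward-to-cost ratio itself is estimated optimistically.

def optimistic_ratio(mean_r, mean_c, n, t, c_min=1e-3):
    width = math.sqrt(2.0 * math.log(max(t, 2)) / max(n, 1))
    return (mean_r + width) / max(mean_c - width, c_min)

def pick_task(stats, t):
    # stats: list of (mean_reward, mean_cost, pull_count) per task type;
    # the decision module picks the best optimistic ratio.
    return max(range(len(stats)),
               key=lambda i: optimistic_ratio(*stats[i], t))
```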
arXiv Detail & Related papers (2024-02-26T10:11:28Z) - Long-term Fairness For Real-time Decision Making: A Constrained Online Optimization Approach [14.098628848491146]
We introduce a framework for ensuring long-term fairness within dynamic decision-making systems characterized by time-varying fairness constraints.
A novel online algorithm, named LoTFair, is presented that solves the problem 'on the fly'.
arXiv Detail & Related papers (2024-01-04T21:55:50Z) - Efficient Methods for Non-stationary Online Learning [67.3300478545554]
We present efficient methods for optimizing dynamic regret and adaptive regret, which reduce the number of projections per round from $\mathcal{O}(\log T)$ to $1$.
Our technique hinges on the reduction mechanism developed in parameter-free online learning and requires non-trivial twists on non-stationary online methods.
arXiv Detail & Related papers (2023-09-16T07:30:12Z) - Online Learning under Budget and ROI Constraints via Weak Adaptivity [57.097119428915796]
Existing primal-dual algorithms for constrained online learning problems rely on two fundamental assumptions.
We show how such assumptions can be circumvented by endowing standard primal-dual templates with weakly adaptive regret minimizers.
We prove the first best-of-both-worlds no-regret guarantees which hold in absence of the two aforementioned assumptions.
arXiv Detail & Related papers (2023-02-02T16:30:33Z) - Optimal Admission Control for Multiclass Queues with Time-Varying Arrival Rates via State Abstraction [16.99621896314678]
We consider a novel queuing problem where the decision-maker must choose to accept or reject randomly arriving tasks.
The objective is to decide which tasks to accept so that the total price of tasks processed is maximized over a finite horizon.
We show that the optimal value function has a specific structure, which enables us to solve the hybrid MDP exactly.
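For intuition, here is a toy finite-horizon accept/reject dynamic program; the classes, prices, arrival probabilities, and the no-departure simplification are made up for illustration and are much cruder than the paper's hybrid MDP with time-varying arrival rates.

```python
from functools import lru_cache

# Toy finite-horizon accept/reject dynamic program for intuition only:
# all parameters below are hypothetical, and accepted tasks are assumed
# to hold a server until the horizon.
PRICES = {0: 1.0, 1: 3.0}
ARRIVALS = {0: 0.6, 1: 0.4}    # one arrival per period, by class
HORIZON, CAPACITY = 20, 5

@lru_cache(maxsize=None)
def value(t, free):
    # Expected revenue-to-go with `free` servers and t periods left.
    if t == 0:
        return 0.0
    ev = 0.0
    for k, p in ARRIVALS.items():
        reject = value(t - 1, free)
        accept = PRICES[k] + value(t - 1, free - 1) if free > 0 else reject
        ev += p * max(accept, reject)
    return ev

print(round(value(HORIZON, CAPACITY), 3))  # optimal expected revenue
```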
arXiv Detail & Related papers (2022-03-14T12:38:13Z) - MUSBO: Model-based Uncertainty Regularized and Sample Efficient Batch Optimization for Deployment Constrained Reinforcement Learning [108.79676336281211]
Continuous deployment of new policies for data collection and online learning is either cost-ineffective or impractical.
We propose a new algorithmic learning framework called Model-based Uncertainty regularized and Sample Efficient Batch Optimization.
Our framework discovers novel and high quality samples for each deployment to enable efficient data collection.
arXiv Detail & Related papers (2021-02-23T01:30:55Z) - Regularized Online Allocation Problems: Fairness and Beyond [7.433931244705934]
We introduce the regularized online allocation problem, a variant that includes a non-linear regularizer acting on the total resource consumption.
In this problem, requests repeatedly arrive over time and, for each request, a decision maker needs to take an action that generates a reward and consumes resources.
The objective is to simultaneously maximize additively separable rewards and the value of a non-separable regularizer subject to the resource constraints.
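A hedged sketch of one way such a problem can be attacked with dual prices plus a regularizer gradient; the balancing log-regularizer and the update rule below are our own simplifications, not the paper's dual-based algorithm.

```python
import numpy as np

# Hedged sketch of per-request decisions under dual prices plus a
# regularizer gradient (our simplification, not the paper's method). The
# balancing log-regularizer on average consumption is chosen purely for
# illustration.

def reg_grad(avg, eps=1e-6):
    # Gradient of R(a) = sum_i log(a_i): favors under-used resources.
    return 1.0 / (avg + eps)

def serve(requests, budget, step=0.05):
    """requests: list of (rewards, costs); costs has one row per action."""
    budget = np.asarray(budget, dtype=float)
    mu = np.zeros_like(budget)            # dual prices on the resources
    avg = np.full_like(budget, 1e-3)      # running average consumption
    rho = budget / max(len(requests), 1)  # per-request budget target
    picks = []
    for t, (rewards, costs) in enumerate(requests, 1):
        price = mu - step * reg_grad(avg)     # regularizer acts as a bonus
        picks.append(int(np.argmax(rewards - costs @ price)))
        avg += (costs[picks[-1]] - avg) / t
        mu = np.maximum(0.0, mu + step * (costs[picks[-1]] - rho))
    return picks
```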
arXiv Detail & Related papers (2020-07-01T14:24:58Z) - Toward Optimal Adversarial Policies in the Multiplicative Learning System with a Malicious Expert [87.12201611818698]
We consider a learning system that combines experts' advice to predict a sequence of true outcomes.
It is assumed that one of the experts is malicious and aims to impose the maximum loss on the system.
We show that a simple greedy policy of always reporting a false prediction is optimal with an approximation ratio of $1+O(\sqrt{\frac{\ln N}{N}})$.
For the online setting where the malicious expert can adaptively make its decisions, we show that the optimal online policy can be efficiently computed by solving a dynamic program in $O(N^3)$.
arXiv Detail & Related papers (2020-01-02T18:04:46Z)
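To illustrate the setting of this last paper, the following simulation (our own toy, with assumed parameters and honest-expert accuracy) runs multiplicative weights over N experts, where expert 0 follows the greedy malicious policy of always reporting the false prediction.

```python
import numpy as np

# Toy simulation (all parameters are our assumptions): multiplicative
# weights over N experts, with expert 0 following the greedy malicious
# policy of always reporting the false outcome.

def simulate(N=20, T=500, eta=0.1, honest_acc=0.7, seed=0):
    rng = np.random.default_rng(seed)
    w = np.ones(N)
    mistakes = 0
    for _ in range(T):
        truth = int(rng.integers(2))
        preds = np.where(rng.random(N) < honest_acc, truth, 1 - truth)
        preds[0] = 1 - truth                 # malicious: always false
        vote = (w @ preds) / w.sum()         # weighted-majority forecast
        mistakes += int(round(vote) != truth)
        w *= np.exp(-eta * (preds != truth)) # penalize mistaken experts
    return mistakes / T

print(simulate())  # system loss rate under the greedy malicious policy
```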
This list is automatically generated from the titles and abstracts of the papers in this site.