Keep Everyone Happy: Online Fair Division of Numerous Items with Few Copies
- URL: http://arxiv.org/abs/2408.12845v2
- Date: Thu, 29 May 2025 17:56:45 GMT
- Title: Keep Everyone Happy: Online Fair Division of Numerous Items with Few Copies
- Authors: Arun Verma, Indrajit Saha, Makoto Yokoo, Bryan Kian Hsiang Low
- Abstract summary: We consider a novel variant of the online fair division problem involving multiple agents in which a learner sequentially observes an indivisible item. Existing algorithms assume a small number of items with a sufficiently large number of copies, which ensures a good utility estimation for all item-agent pairs. We propose algorithms that model online fair division as a contextual bandit problem, with sub-linear regret guarantees.
- Score: 41.57721032039409
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper considers a novel variant of the online fair division problem involving multiple agents in which a learner sequentially observes an indivisible item that has to be irrevocably allocated to one of the agents while satisfying a fairness and efficiency constraint. Existing algorithms assume a small number of items with a sufficiently large number of copies, which ensures a good utility estimation for all item-agent pairs from noisy bandit feedback. However, this assumption may not hold in many real-life applications, for example, an online platform that has a large number of users (items) who use the platform's service providers (agents) only a few times (a few copies of items), which makes it difficult to accurately estimate utilities for all item-agent pairs. To address this, we assume utility is an unknown function of item-agent features. We then propose algorithms that model online fair division as a contextual bandit problem, with sub-linear regret guarantees. Our experimental results further validate the effectiveness of the proposed algorithms.
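The abstract's formulation — sequentially arriving items, irrevocable allocation, utilities as an unknown function of item-agent features, noisy bandit feedback — can be sketched as a contextual bandit loop. The LinUCB-style estimator and the fairness penalty below are illustrative assumptions for exposition, not the paper's actual algorithms.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup: d-dimensional item-agent features, linear utility.
# The learner does NOT know theta_true; it learns from noisy feedback.
n_agents, d, horizon = 3, 4, 200
theta_true = rng.normal(size=d)

def feature(item, agent):
    """Joint item-agent feature vector (assumed observable by the learner)."""
    return np.concatenate([item[:2], item[2:] * (agent + 1) / n_agents])

A = [np.eye(d) for _ in range(n_agents)]   # per-agent design matrices
b = [np.zeros(d) for _ in range(n_agents)]
alloc = np.zeros(n_agents)                 # items allocated to each agent so far

for t in range(horizon):
    item = rng.normal(size=d)              # an indivisible item arrives
    ucb = np.empty(n_agents)
    for a in range(n_agents):
        x = feature(item, a)
        theta_hat = np.linalg.solve(A[a], b[a])        # ridge estimate of utility
        bonus = np.sqrt(x @ np.linalg.solve(A[a], x))  # optimism for exploration
        # Subtract a toy penalty nudging allocation counts toward balance
        ucb[a] = x @ theta_hat + bonus - 0.1 * (alloc[a] - alloc.mean())
    a = int(np.argmax(ucb))                # irrevocable allocation of this item
    x = feature(item, a)
    reward = x @ theta_true + 0.1 * rng.normal()   # noisy bandit feedback
    A[a] += np.outer(x, x)
    b[a] += reward * x
    alloc[a] += 1

print(alloc.sum())  # every item was allocated to exactly one agent (prints 200.0)
```

The key difference from a standard contextual bandit is the bookkeeping term: the learner trades off estimated utility (efficiency) against the balance of allocation counts (a crude stand-in for a fairness constraint).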
Related papers
- COBRA: Contextual Bandit Algorithm for Ensuring Truthful Strategic Agents [41.57721032039409]
Existing work in contextual bandits assumes that agents truthfully report their arms, which is unrealistic in many real-life applications. We propose an algorithm, COBRA, for contextual bandit problems involving strategic agents that disincentivize their strategic behavior without using any monetary incentives.
arXiv Detail & Related papers (2025-05-29T17:53:12Z) - Multi Agent Reinforcement Learning for Sequential Satellite Assignment Problems [5.896440476510869]
Assignment problems are a classic optimization problem in which a group of agents is assigned to a group of tasks.
In many modern-day applications such as satellite, power grids, and mobile robot scheduling, assignment problems unfold over time.
We apply multi-agent reinforcement learning to this problem, learning the value of assignments by bootstrapping from a known RL-time greedy solver.
We demonstrate that our algorithm is theoretically justified and avoids pitfalls experienced by other algorithms in this setting.
arXiv Detail & Related papers (2024-12-20T05:10:34Z) - A Two-Stage Algorithm for Cost-Efficient Multi-instance Counterfactual Explanations [2.992602379681373]
We propose a flexible two-stage algorithm for finding groups of instances and computing cost-efficient multi-instance counterfactual explanations.
The paper presents the algorithm and its performance against popular alternatives through a comparative evaluation.
arXiv Detail & Related papers (2024-03-02T14:30:57Z) - Best Arm Identification in Batched Multi-armed Bandit Problems [3.908867017036043]
Recently, the multi-armed bandit problem has arisen in many real-life scenarios where arms must be sampled in batches.
We introduce a general linear programming framework that can incorporate objectives of different theoretical settings in best arm identification.
We demonstrate by numerical studies that the algorithm also has good performance compared to certain UCB-type or Thompson sampling methods.
arXiv Detail & Related papers (2023-12-21T14:16:38Z) - Robust and Performance Incentivizing Algorithms for Multi-Armed Bandits with Strategic Agents [52.75161794035767]
We introduce a class of bandit algorithms that meet the two objectives of performance incentivization and robustness simultaneously.
We show that settings where the principal has no information about the arms' performance characteristics can be handled by combining ideas from second price auctions with our algorithms.
arXiv Detail & Related papers (2023-12-13T06:54:49Z) - Counterfactual Conservative Q Learning for Offline Multi-agent Reinforcement Learning [54.788422270960496]
We propose a novel multi-agent offline RL algorithm, named CounterFactual Conservative Q-Learning (CFCQL)
CFCQL calculates conservative regularization for each agent separately in a counterfactual way and then linearly combines them to realize an overall conservative value estimation.
We prove that it still enjoys the underestimation property and the performance guarantee as those single agent conservative methods do, but the induced regularization and safe policy improvement bound are independent of the agent number.
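The per-agent, counterfactual structure described above can be illustrated numerically: compute a CQL-style conservative gap for each agent separately, varying only that agent's action while holding the others at their dataset values, then combine the gaps linearly. The toy Q-table, weights, and data below are assumptions for exposition, not CFCQL itself.

```python
import numpy as np

rng = np.random.default_rng(1)
n_agents, n_actions = 2, 3
Q = rng.normal(size=(n_actions, n_actions))  # joint Q-values for two agents

behavior_joint = (1, 2)                      # joint action observed in the dataset

def per_agent_penalty(agent):
    """CQL-style gap: logsumexp over this agent's counterfactual actions
    minus the dataset action's value, with the other agent held fixed."""
    if agent == 0:
        q_cf = Q[:, behavior_joint[1]]       # vary agent 0's action only
    else:
        q_cf = Q[behavior_joint[0], :]       # vary agent 1's action only
    lse = np.log(np.exp(q_cf).sum())
    return lse - Q[behavior_joint]

weights = np.full(n_agents, 1.0 / n_agents)  # linear combination weights
penalty = sum(w * per_agent_penalty(i) for i, w in enumerate(weights))
print(penalty >= 0)  # each per-agent gap is nonnegative, so the sum is too
```

Because each counterfactual set includes the dataset action itself, every per-agent gap is nonnegative, and the linearly combined penalty depends on per-agent action spaces rather than the exponentially large joint action space — consistent with the agent-number independence claimed above.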
arXiv Detail & Related papers (2023-09-22T08:10:25Z) - Multicopy Reinforcement Learning Agents [0.23090185577016445]
This paper examines a novel type of multi-agent problem, in which an agent makes multiple identical copies of itself in order to achieve a single agent task better or more efficiently.<n>We propose a learning algorithm for this multicopy problem which takes advantage of the structure of the value function to efficiently learn how to balance the advantages and costs of adding additional copies.
arXiv Detail & Related papers (2023-09-19T20:03:17Z) - A Dataset on Malicious Paper Bidding in Peer Review [84.68308372858755]
Malicious reviewers strategically bid in order to unethically manipulate the paper assignment.
A critical impediment towards creating and evaluating methods to mitigate this issue is the lack of publicly-available data on malicious paper bidding.
We release a novel dataset, collected from a mock conference activity where participants were instructed to bid either honestly or maliciously.
arXiv Detail & Related papers (2022-06-24T20:23:33Z) - Multi-Agent Neural Rewriter for Vehicle Routing with Limited Disclosure of Costs [65.23158435596518]
We solve the multi-vehicle routing problem as a team Markov game with partially observable costs.
Our multi-agent reinforcement learning approach, the so-called multi-agent Neural Rewriter, builds on the single-agent Neural Rewriter to solve the problem by iteratively rewriting solutions.
arXiv Detail & Related papers (2022-06-13T09:17:40Z) - Off-Policy Correction For Multi-Agent Reinforcement Learning [9.599347559588216]
Multi-agent reinforcement learning (MARL) provides a framework for problems involving multiple interacting agents.
Despite apparent similarity to the single-agent case, multi-agent problems are often harder to train and analyze theoretically.
We propose MA-Trace, a new on-policy actor-critic algorithm, which extends V-Trace to the MARL setting.
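Since MA-Trace is described as extending V-Trace, a sketch of the underlying V-Trace off-policy correction may help: truncated importance weights rho and c reweight TD errors computed from behavior-policy data. The trajectory and likelihood ratios below are toy assumptions; only the recursion follows the standard V-Trace definition.

```python
import numpy as np

def vtrace_targets(rewards, values, bootstrap, ratios, gamma=0.99,
                   rho_bar=1.0, c_bar=1.0):
    """Backward recursion for V-Trace value targets."""
    T = len(rewards)
    rho = np.minimum(ratios, rho_bar)   # truncated IS weight on the TD error
    c = np.minimum(ratios, c_bar)       # truncated trace-cutting weight
    vs = np.zeros(T)
    next_vs = bootstrap
    next_value = bootstrap
    for t in reversed(range(T)):
        delta = rho[t] * (rewards[t] + gamma * next_value - values[t])
        vs[t] = values[t] + delta + gamma * c[t] * (next_vs - next_value)
        next_vs, next_value = vs[t], values[t]
    return vs

rewards = np.array([1.0, 0.0, 1.0])
values = np.array([0.5, 0.5, 0.5])     # critic's estimates along the trajectory
ratios = np.array([1.2, 0.8, 1.0])     # pi/mu likelihood ratios (toy values)
targets = vtrace_targets(rewards, values, bootstrap=0.0, ratios=ratios)
print(targets.shape)
```

When the ratios are all 1 (on-policy data), the recursion collapses to the ordinary n-step return, which is why such corrections let off-policy data be reused safely.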
arXiv Detail & Related papers (2021-11-22T14:23:13Z) - On the Use and Misuse of Absorbing States in Multi-agent Reinforcement Learning [55.95253619768565]
Current MARL algorithms assume that the number of agents within a group remains fixed throughout an experiment.
In many practical problems, an agent may terminate before their teammates.
We present a novel architecture for an existing state-of-the-art MARL algorithm which uses attention instead of a fully connected layer with absorbing states.
arXiv Detail & Related papers (2021-11-10T23:45:08Z) - PROPm Allocations of Indivisible Goods to Multiple Agents [3.361550072563245]
We study the classic problem of fairly allocating a set of indivisible goods among a group of agents, and focus on the notion of approximate proportionality known as PROPm.
We show that a PROPm allocation is guaranteed to exist for all instances, independent of the number of agents or goods.
arXiv Detail & Related papers (2021-05-24T15:34:33Z) - Multi-agent Policy Optimization with Approximatively Synchronous Advantage Estimation [55.96893934962757]
In multi-agent systems, policies of different agents need to be evaluated jointly.
In current methods, value functions or advantage functions use counterfactual joint actions, which are evaluated asynchronously.
In this work, we propose the approximatively synchronous advantage estimation.
arXiv Detail & Related papers (2020-12-07T07:29:19Z) - Semi-Supervised Crowd Counting via Self-Training on Surrogate Tasks [50.78037828213118]
This paper tackles the semi-supervised crowd counting problem from the perspective of feature learning.
We propose a novel semi-supervised crowd counting method which is built upon two innovative components.
arXiv Detail & Related papers (2020-07-07T05:30:53Z) - A SUPER* Algorithm to Optimize Paper Bidding in Peer Review [39.99497980352629]
We present an algorithm called SUPER*, inspired by the A* algorithm, for this goal.
Under a community model for the similarities, we prove that SUPER* is near-optimal whereas the popular baselines are considerably suboptimal.
In experiments on real data from ICLR 2018 and synthetic data, we find that SUPER* considerably outperforms baselines deployed in existing systems.
arXiv Detail & Related papers (2020-06-27T06:44:49Z) - Scalable Multi-Agent Inverse Reinforcement Learning via Actor-Attention-Critic [54.2180984002807]
Multi-agent adversarial inverse reinforcement learning (MA-AIRL) is a recent approach that applies single-agent AIRL to multi-agent problems.
We propose a multi-agent inverse RL algorithm that is more sample-efficient and scalable than previous works.
arXiv Detail & Related papers (2020-02-24T20:30:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.