Learning Resource Allocation Policies from Observational Data with an
Application to Homeless Services Delivery
- URL: http://arxiv.org/abs/2201.10053v2
- Date: Fri, 3 Jun 2022 20:37:25 GMT
- Title: Learning Resource Allocation Policies from Observational Data with an
Application to Homeless Services Delivery
- Authors: Aida Rahmattalabi, Phebe Vayanos, Kathryn Dullerud, Eric Rice
- Abstract summary: We study the problem of learning, from observational data, fair and interpretable policies that effectively match heterogeneous individuals to scarce resources of different types.
We conduct extensive analyses using synthetic and real-world data.
- Score: 9.65131987576314
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of learning, from observational data, fair and
interpretable policies that effectively match heterogeneous individuals to
scarce resources of different types. We model this problem as a multi-class
multi-server queuing system where both individuals and resources arrive
stochastically over time. Each individual, upon arrival, is assigned to a queue
where they wait to be matched to a resource. The resources are assigned in a
first come first served (FCFS) fashion according to an eligibility structure
that encodes the resource types that serve each queue. We propose a methodology
based on techniques in modern causal inference to construct the individual
queues as well as learn the matching outcomes and provide a mixed-integer
optimization (MIO) formulation to optimize the eligibility structure. The MIO
problem maximizes policy outcome subject to wait time and fairness constraints.
The formulation is flexible, accommodating additional linear domain constraints. We
conduct extensive analyses using synthetic and real-world data. In particular,
we evaluate our framework using data from the U.S. Homeless Management
Information System (HMIS). We obtain wait times as low as an FCFS policy while
improving the rate of exit from homelessness for underserved or vulnerable
groups (7% higher for Black individuals and 15% higher for those under 17
years old) and overall.
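The paper's MIO formulation is not reproduced on this page. As a purely illustrative, hedged sketch, the snippet below optimizes a toy eligibility structure under a fluid (average-rate) approximation with capacity, demand, fairness, and interpretability constraints; all queue names, resource types, capacities, and outcome estimates are made-up assumptions, and this PuLP model is not the authors' formulation.

```python
# Minimal sketch (not the paper's formulation): choose eligibility edges
# y[q][r] and matching rates x[q][r] to maximize expected outcomes.
# Requires: pip install pulp. All data below are made-up illustrations.
import pulp

queues = ["low_risk", "medium_risk", "high_risk"]
res_names = ["rapid_rehousing", "permanent_housing"]
capacity = {"rapid_rehousing": 50, "permanent_housing": 30}
demand = {"low_risk": 40, "medium_risk": 30, "high_risk": 20}
# Estimated P(exit from homelessness) per (queue, resource) pair, as would
# come from the causal-inference step described in the abstract:
outcome = {("low_risk", "rapid_rehousing"): 0.60,
           ("low_risk", "permanent_housing"): 0.70,
           ("medium_risk", "rapid_rehousing"): 0.50,
           ("medium_risk", "permanent_housing"): 0.70,
           ("high_risk", "rapid_rehousing"): 0.30,
           ("high_risk", "permanent_housing"): 0.65}

prob = pulp.LpProblem("eligibility_design", pulp.LpMaximize)
y = pulp.LpVariable.dicts("edge", (queues, res_names), cat="Binary")
x = pulp.LpVariable.dicts("rate", (queues, res_names), lowBound=0)

# Objective: total expected exits from homelessness.
prob += pulp.lpSum(outcome[q, r] * x[q][r] for q in queues for r in res_names)
for r in res_names:  # resources are scarce
    prob += pulp.lpSum(x[q][r] for q in queues) <= capacity[r]
for q in queues:     # cannot serve more people than arrive
    prob += pulp.lpSum(x[q][r] for r in res_names) <= demand[q]
    for r in res_names:  # flow is allowed only on active eligibility edges
        prob += x[q][r] <= demand[q] * y[q][r]
# Illustrative fairness constraint: serve >= 80% of the most vulnerable queue.
prob += pulp.lpSum(x["high_risk"][r] for r in res_names) >= 0.8 * demand["high_risk"]
# Illustrative interpretability constraint: at most 4 eligibility edges.
prob += pulp.lpSum(y[q][r] for q in queues for r in res_names) <= 4

prob.solve(pulp.PULP_CBC_CMD(msg=False))
for q in queues:
    for r in res_names:
        if y[q][r].value() and x[q][r].value() > 1e-6:
            print(f"{r} serves {q} at rate {x[q][r].value():.1f}")
```

Under these toy numbers the solver activates a small set of eligibility edges, mirroring the abstract's idea of an interpretable structure that trades total outcomes against fairness constraints.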
Related papers
- Dynamic Matching with Post-allocation Service and its Application to Refugee Resettlement [1.9689888982532262]
Motivated by our collaboration with a major refugee resettlement agency in the U.S., we study a dynamic matching problem where each new arrival (a refugee case) must be matched immediately and irrevocably to one of the static resources (a location with a fixed annual quota).
Because service is time-consuming, a server may not be available at a given time; we therefore refer to it as a dynamic resource. Upon matching, the case waits to receive service in a first-come-first-served manner.
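As a hedged illustration of immediate, irrevocable matching under fixed quotas (not the paper's algorithm), a greedy rule might look like the following; the location names and scores are hypothetical.

```python
# Illustrative greedy rule, not the paper's method: each arriving case is
# matched immediately and irrevocably to the location with the highest
# predicted outcome among those with remaining annual quota.
def match_case(case_scores, remaining_quota):
    """case_scores: {location: predicted outcome for this case}
    remaining_quota: {location: slots left this year} (mutated on match)
    Returns the chosen location, or None if every quota is exhausted."""
    open_locs = [loc for loc, q in remaining_quota.items() if q > 0]
    if not open_locs:
        return None
    best = max(open_locs, key=lambda loc: case_scores[loc])
    remaining_quota[best] -= 1
    return best

quota = {"city_a": 2, "city_b": 1}
print(match_case({"city_a": 0.7, "city_b": 0.4}, quota))  # -> city_a
```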
arXiv Detail & Related papers (2024-10-30T13:17:38Z)
- Active Learning for Fair and Stable Online Allocations [6.23798328186465]
We consider feedback from a select subset of agents at each epoch of the online resource allocation process.
Our algorithms provide regret bounds that are sub-linear in the number of time periods for various measures.
We show that efficient decision-making does not require extensive feedback and produces efficient outcomes for a variety of problem classes.
arXiv Detail & Related papers (2024-06-20T23:23:23Z)
- A Resource-Adaptive Approach for Federated Learning under Resource-Constrained Environments [22.038826059430242]
The paper studies a fundamental federated learning (FL) problem involving multiple clients with heterogeneous, constrained resources.
We propose Fed-RAA: a Resource-Adaptive Asynchronous Federated learning algorithm.
arXiv Detail & Related papers (2024-06-19T08:55:40Z)
- Fine-Tuning Language Models with Reward Learning on Policy [68.70065254564642]
Reinforcement learning from human feedback (RLHF) has emerged as an effective approach to aligning large language models (LLMs) to human preferences.
Despite its popularity, (fixed) reward models may become inaccurate off-distribution, since policy optimization continuously shifts the LLM's data distribution.
We propose reward learning on policy (RLP), an unsupervised framework that refines a reward model using policy samples to keep it on-distribution.
arXiv Detail & Related papers (2024-03-28T10:02:10Z)
- Learning Optimal and Fair Policies for Online Allocation of Scarce Societal Resources from Data Collected in Deployment [5.0904557821667]
We use administrative data collected in deployment to design an online policy that maximizes expected outcomes while satisfying budget constraints.
We show that using our policies improves rates of exit from homelessness by 1.9% and that policies that are fair in either allocation or outcomes by race come at a very low price of fairness.
arXiv Detail & Related papers (2023-11-23T01:40:41Z)
- Improving Generalization of Alignment with Human Preferences through Group Invariant Learning [56.19242260613749]
Reinforcement Learning from Human Feedback (RLHF) enables the generation of responses more aligned with human preferences.
Previous work shows that Reinforcement Learning (RL) often exploits shortcuts to attain high rewards and overlooks challenging samples.
We propose a novel approach that can learn a consistent policy via RL across various data groups or domains.
arXiv Detail & Related papers (2023-10-18T13:54:15Z)
- Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices).
arXiv Detail & Related papers (2021-11-28T19:03:39Z)
- Optimal Resource Allocation for Serverless Queries [8.59568779761598]
Prior work focused on predicting peak allocation while ignoring aggressive trade-offs between resource allocation and run-time.
We introduce a system for optimal resource allocation that can predict performance with aggressive trade-offs, for both new and past observed queries.
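As a toy, hedged sketch of the allocation/run-time trade-off mentioned above (the paper's performance predictor and objective are not reproduced), one might pick the cheapest allocation whose predicted run-time meets a deadline; the container-seconds cost model and the numbers are assumptions.

```python
# Toy trade-off: given predicted run-times at candidate allocations (from
# some learned performance model), pick the cheapest allocation, measured
# in container-seconds, whose prediction still meets the deadline.
def choose_allocation(predicted_runtime, deadline):
    """predicted_runtime: {containers: predicted seconds}."""
    feasible = {c: t for c, t in predicted_runtime.items() if t <= deadline}
    if not feasible:
        return max(predicted_runtime)  # fall back to the peak allocation
    return min(feasible, key=lambda c: c * feasible[c])

# A query that stops speeding up past 8 containers:
pred = {2: 120.0, 4: 70.0, 8: 45.0, 16: 40.0}
print(choose_allocation(pred, deadline=90.0))  # -> 4 (cheapest feasible)
```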
arXiv Detail & Related papers (2021-07-19T02:55:48Z)
- MUSBO: Model-based Uncertainty Regularized and Sample Efficient Batch Optimization for Deployment Constrained Reinforcement Learning [108.79676336281211]
Continuous deployment of new policies for data collection and online learning is either cost-ineffective or impractical.
We propose a new algorithmic learning framework called Model-based Uncertainty regularized and Sample Efficient Batch Optimization (MUSBO).
Our framework discovers novel and high-quality samples for each deployment to enable efficient data collection.
arXiv Detail & Related papers (2021-02-23T01:30:55Z)
- Online Learning Demands in Max-min Fairness [91.37280766977923]
We describe mechanisms for the allocation of a scarce resource among multiple users in a way that is efficient, fair, and strategy-proof.
The mechanism is repeated for multiple rounds and a user's requirements can change on each round.
At the end of each round, users provide feedback about the allocation they received, enabling the mechanism to learn user preferences over time.
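For a single round, a standard max-min fair allocation of one divisible resource can be computed by water-filling; this hedged sketch omits the paper's strategy-proofness machinery and the feedback-driven learning across rounds.

```python
# Classic water-filling for max-min fairness: scan users in increasing
# order of demand, giving each the smaller of its demand and an equal
# share of whatever capacity remains.
def max_min_fair(demands, capacity):
    """demands: {user: requested amount}; returns {user: allocation}."""
    alloc, remaining = {}, capacity
    users = sorted(demands, key=demands.get)
    for i, u in enumerate(users):
        share = remaining / (len(users) - i)  # equal split of what's left
        alloc[u] = min(demands[u], share)     # capped by the user's demand
        remaining -= alloc[u]
    return alloc

print(max_min_fair({"a": 2, "b": 4, "c": 10}, capacity=12))
# -> {'a': 2, 'b': 4, 'c': 6}: small demands fully met, remainder split
```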
arXiv Detail & Related papers (2020-12-15T22:15:20Z)
- Coordinated Online Learning for Multi-Agent Systems with Coupled Constraints and Perturbed Utility Observations [91.02019381927236]
We introduce a novel method to steer the agents toward a stable population state, fulfilling the given resource constraints.
The proposed method is a decentralized resource pricing method based on the resource loads resulting from the augmentation of the game's Lagrangian.
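The pricing rule described is a form of Lagrangian dual ascent on the coupled resource constraint; a generic, hedged version of such a price update (not the paper's exact dynamics) is sketched below.

```python
# Generic dual-ascent price update for a shared resource constraint:
# raise the price when aggregate load exceeds capacity, lower it (never
# below zero) when there is slack. Step size and loads are illustrative.
def update_price(price, load, capacity, step=0.1):
    return max(0.0, price + step * (load - capacity))

price = 0.0
for load in [14.0, 12.0, 11.0, 9.0]:  # agents' loads respond to prices
    price = update_price(price, load, capacity=10.0)
    print(f"load={load:4.1f} -> price={price:.2f}")
```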
arXiv Detail & Related papers (2020-10-21T10:11:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences of its use.