Designing Long-term Group Fair Policies in Dynamical Systems
- URL: http://arxiv.org/abs/2311.12447v1
- Date: Tue, 21 Nov 2023 08:58:50 GMT
- Title: Designing Long-term Group Fair Policies in Dynamical Systems
- Authors: Miriam Rateike, Isabel Valera and Patrick Forré
- Abstract summary: We propose a novel framework for achieving long-term group fairness in dynamical systems.
Our framework allows us to identify a time-independent policy that converges, if deployed, to the targeted fair stationary state of the system in the long term.
- Score: 12.115106776644156
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neglecting the effect that decisions have on individuals (and thus, on the
underlying data distribution) when designing algorithmic decision-making
policies may increase inequalities and unfairness in the long term - even if
fairness considerations were taken into account in the policy design process. In this
paper,
we propose a novel framework for achieving long-term group fairness in
dynamical systems, in which current decisions may affect an individual's
features in the next step, and thus, future decisions. Specifically, our
framework allows us to identify a time-independent policy that converges, if
deployed, to the targeted fair stationary state of the system in the long term,
independently of the initial data distribution. We model the system dynamics
with a time-homogeneous Markov chain and optimize the policy leveraging the
Markov chain convergence theorem to ensure unique convergence. We provide
examples of different targeted fair states of the system, encompassing a range
of long-term goals for society and policymakers. Furthermore, we show how our
approach facilitates the evaluation of different long-term targets by examining
their impact on the group-conditional population distribution in the long term
and how it evolves until convergence.
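To make the core mechanism concrete, below is a minimal sketch (not the authors' implementation) of the property the framework exploits: a fixed policy induces a time-homogeneous transition matrix, and if the resulting chain is ergodic, the Markov chain convergence theorem guarantees convergence to a unique stationary distribution from any initial distribution. The two-state dynamics and acceptance probabilities are hypothetical.

```python
import numpy as np

def transition_matrix(pi):
    """Illustrative two-state dynamics: acceptance moves mass toward the
    'qualified' state 1; rejection in state 1 pushes individuals back."""
    p0, p1 = pi  # hypothetical acceptance probability in each state
    return np.array([
        [1 - 0.5 * p0, 0.5 * p0],
        [0.3 * (1 - p1), 1 - 0.3 * (1 - p1)],
    ])

def long_run_distribution(P, mu0, steps=1000):
    mu = np.asarray(mu0, dtype=float)
    for _ in range(steps):
        mu = mu @ P  # one step of the time-homogeneous chain
    return mu

P = transition_matrix(pi=(0.6, 0.9))
# Any initial distribution converges to the same stationary state.
print(long_run_distribution(P, [1.0, 0.0]))
print(long_run_distribution(P, [0.0, 1.0]))
```

Optimizing the policy then amounts to choosing `pi` so that this unique stationary distribution matches the targeted fair state.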
Related papers
- Consensus in Motion: A Case of Dynamic Rationality of Sequential Learning in Probability Aggregation [0.562479170374811]
We propose a framework for probability aggregation based on propositional probability logic.
We show that any consensus-compatible and independent aggregation rule on a non-nested agenda is necessarily linear.
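As a small illustration of the linear form the theorem singles out, here is a minimal linear opinion pool; the weights and probabilities are hypothetical.

```python
def linear_pool(probabilities, weights):
    """Aggregate probability as a fixed weighted average of agents' reports."""
    assert abs(sum(weights) - 1.0) < 1e-9
    return sum(w * p for w, p in zip(weights, probabilities))

# Three agents' probabilities for the same proposition:
print(linear_pool([0.2, 0.5, 0.8], weights=[0.5, 0.3, 0.2]))  # 0.41
```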
arXiv Detail & Related papers (2025-04-20T14:04:39Z)
- Long-Term Fairness in Sequential Multi-Agent Selection with Positive Reinforcement [21.44063458579184]
In selection processes such as college admissions or hiring, biasing slightly towards applicants from under-represented groups is hypothesized to provide positive feedback.
We propose the Multi-agent Fair-Greedy policy, which balances greedy score and fairness.
Our results indicate that, while positive reinforcement is a promising mechanism for long-term fairness, policies must be designed carefully to be robust to variations in the evolution model.
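A hedged sketch of how such a rule could look (not the paper's exact Multi-agent Fair-Greedy policy): trade an applicant's score off against a bonus for groups accepted less often so far; `alpha` and all numbers are illustrative.

```python
import numpy as np

def fair_greedy_select(scores, groups, accepted_counts, k, alpha=0.1):
    """Pick k applicants by score plus a bonus that shrinks with how often
    each group has already been accepted. alpha is a hypothetical trade-off."""
    total = sum(accepted_counts.values()) + 1
    adjusted = [
        s + alpha * (1 - accepted_counts[g] / total)
        for s, g in zip(scores, groups)
    ]
    return np.argsort(adjusted)[-k:][::-1]

picked = fair_greedy_select(
    scores=[0.9, 0.85, 0.8], groups=["A", "A", "B"],
    accepted_counts={"A": 80, "B": 20}, k=2)
print(picked)  # the group-B applicant edges out a slightly higher score
```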
arXiv Detail & Related papers (2024-07-10T04:03:23Z)
- Off-Policy Evaluation for Large Action Spaces via Policy Convolution [60.6953713877886]
The Policy Convolution (PC) family of estimators uses latent structure within actions to strategically convolve the logging and target policies.
Experiments on synthetic and benchmark datasets demonstrate remarkable mean squared error (MSE) improvements when using PC.
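A rough sketch of the convolution idea under stated assumptions (a Gaussian kernel over hypothetical action embeddings; the paper's construction differs in detail): smoothing both policies over similar actions keeps importance weights finite even for a deterministic target.

```python
import numpy as np

def convolve_policy(policy, action_emb, bandwidth=1.0):
    """Spread each action's probability over nearby actions using a
    Gaussian kernel on (hypothetical) action embeddings."""
    sq_dist = ((action_emb[:, None, :] - action_emb[None, :, :]) ** 2).sum(-1)
    K = np.exp(-sq_dist / (2 * bandwidth ** 2))
    K /= K.sum(axis=1, keepdims=True)  # each action's mass is redistributed
    return policy @ K

rng = np.random.default_rng(0)
emb = rng.normal(size=(5, 2))    # 5 actions with 2-d latent embeddings
logging = np.full(5, 0.2)        # uniform logging policy
target = np.eye(5)[3]            # deterministic target: always action 3
w = convolve_policy(target, emb) / convolve_policy(logging, emb)
print(w)  # finite importance weights despite the deterministic target
```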
arXiv Detail & Related papers (2023-10-24T01:00:01Z)
- Policy Dispersion in Non-Markovian Environment [53.05904889617441]
This paper learns diverse policies from the history of state-action pairs in a non-Markovian environment.
We first adopt a transformer-based method to learn policy embeddings.
Then, we stack the policy embeddings to construct a dispersion matrix to induce a set of diverse policies.
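A hedged sketch of the dispersion idea, with random vectors standing in for the transformer policy embeddings: build a similarity matrix over the embeddings and score a policy set by its determinant, which shrinks as policies become alike.

```python
import numpy as np

def dispersion_score(policy_embeddings, length_scale=1.0):
    """Kernel similarity matrix over policy embeddings; a larger
    determinant indicates a more diverse set of policies."""
    d = ((policy_embeddings[:, None, :] - policy_embeddings[None, :, :]) ** 2).sum(-1)
    S = np.exp(-d / (2 * length_scale ** 2))
    return np.linalg.det(S)

rng = np.random.default_rng(0)
diverse = rng.normal(size=(4, 8))                      # well-spread embeddings
similar = np.ones((4, 8)) + 0.01 * rng.normal(size=(4, 8))  # near-duplicates
print(dispersion_score(diverse), dispersion_score(similar))  # large vs ~0
```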
arXiv Detail & Related papers (2023-02-28T11:58:39Z)
- Tier Balancing: Towards Dynamic Fairness over Underlying Causal Factors [11.07759054787023]
The pursuit of long-term fairness involves the interplay between decision-making and the underlying data generating process.
We propose Tier Balancing, a technically more challenging but more natural long-term fairness notion to achieve.
Under the specified dynamics, we prove that in general one cannot achieve the long-term fairness goal only through one-step interventions.
arXiv Detail & Related papers (2023-01-21T18:05:59Z)
- Policy Optimization with Advantage Regularization for Long-Term Fairness in Decision Systems [14.095401339355677]
Long-term fairness is an important factor of consideration in designing and deploying learning-based decision systems.
Recent work has proposed the use of Markov Decision Processes (MDPs) to formulate decision-making with long-term fairness requirements.
We show that policy optimization methods from deep reinforcement learning can be used to find strictly better decision policies.
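A minimal sketch of advantage regularization, assuming a REINFORCE-style objective; the disparity signal and `beta` are illustrative, not the paper's exact regularizers.

```python
import numpy as np

def regularized_pg_objective(log_probs, advantages, disparity_delta, beta=1.0):
    """REINFORCE-style objective with a shaped advantage: the usual A(s, a)
    minus beta times the action's estimated increase in group disparity."""
    shaped = advantages - beta * disparity_delta
    return np.mean(log_probs * shaped)  # maximize w.r.t. policy parameters

log_probs = np.array([-0.1, -1.2, -0.5])   # log pi(a|s) for logged actions
advantages = np.array([1.0, 0.5, -0.2])    # standard advantage estimates
disparity = np.array([0.3, -0.1, 0.0])     # hypothetical disparity change
print(regularized_pg_objective(log_probs, advantages, disparity))
```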
arXiv Detail & Related papers (2022-10-22T20:41:36Z)
- Towards Return Parity in Markov Decision Processes [36.96748490812215]
We study a fairness problem in Markov decision processes (MDPs).
We propose return parity, a fairness notion that requires MDPs from different demographic groups to achieve the same expected rewards.
Motivated by our decomposition theorem, we propose algorithms to mitigate return disparity via learning a shared group policy with state visitation distributional alignment.
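A hedged sketch of what such an objective could look like (the paper's alignment uses a proper distributional distance; total variation over a discretized state space stands in here):

```python
import numpy as np

def return_parity_loss(returns_a, returns_b, visit_a, visit_b, lam=0.5):
    """Gap in expected returns between groups plus a penalty on the
    distance between their state-visitation distributions."""
    gap = abs(np.mean(returns_a) - np.mean(returns_b))
    alignment = 0.5 * np.abs(visit_a - visit_b).sum()  # total variation
    return gap + lam * alignment

visit_a = np.array([0.5, 0.3, 0.2])  # group A's state visitation frequencies
visit_b = np.array([0.2, 0.3, 0.5])  # group B's state visitation frequencies
print(return_parity_loss([1.0, 0.8], [0.6, 0.7], visit_a, visit_b))
```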
arXiv Detail & Related papers (2021-11-19T23:25:38Z)
- Fair Incentives for Repeated Engagement [0.46040036610482665]
We study a problem of finding optimal monetary incentive schemes for retention when faced with agents whose participation decisions depend on the incentive they receive.
We show that even in the absence of explicit discrimination, policies may unintentionally discriminate between agents of different types by varying the type composition of the system.
arXiv Detail & Related papers (2021-10-28T04:13:53Z)
- On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control [47.71156648737803]
Reinforcement learning is a framework for interactive decision-making with incentives sequentially revealed across time without a system dynamics model.
We characterize the exit and transition times of a suitably defined Markov chain, identifying that policies associated with Lévy processes of a heavier tail index yield wider peaks.
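An illustrative contrast, not the paper's algorithm: heavy-tailed exploration noise takes occasional large jumps, which is the mechanism behind escaping narrow optima; the tail parameter is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
gaussian_steps = rng.normal(size=100_000)          # light-tailed exploration
heavy_steps = rng.standard_t(df=1.5, size=100_000) # heavy-tailed (small df)

for name, steps in [("gaussian", gaussian_steps), ("heavy-tailed", heavy_steps)]:
    print(name, "fraction of |step| > 5:", np.mean(np.abs(steps) > 5))
```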
arXiv Detail & Related papers (2021-06-15T20:12:44Z)
- Offline Policy Selection under Uncertainty [113.57441913299868]
We consider offline policy selection as learning preferences over a set of policy prospects given a fixed experience dataset.
Access to the full distribution over one's belief of the policy value enables more flexible selection algorithms under a wider range of downstream evaluation metrics.
We show how BayesDICE may be used to rank policies with respect to arbitrary downstream policy selection metrics.
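A hedged sketch of the ranking step, with Gaussian samples standing in for BayesDICE posteriors: once a belief distribution over each policy's value is available, any downstream selection metric is just a functional of the samples.

```python
import numpy as np

rng = np.random.default_rng(0)
posteriors = {  # hypothetical posterior samples of each policy's value
    "pi_1": rng.normal(1.0, 0.5, size=1000),  # higher mean, higher variance
    "pi_2": rng.normal(0.9, 0.1, size=1000),  # lower mean, low variance
}

def rank(posteriors, metric):
    """Order policies by any metric computed on their value samples."""
    return sorted(posteriors, key=lambda k: metric(posteriors[k]), reverse=True)

print(rank(posteriors, np.mean))                        # risk-neutral ranking
print(rank(posteriors, lambda s: np.quantile(s, 0.1)))  # risk-averse ranking
```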
arXiv Detail & Related papers (2020-12-12T23:09:21Z)
- Reliable Off-policy Evaluation for Reinforcement Learning [53.486680020852724]
In a sequential decision-making problem, off-policy evaluation estimates the expected cumulative reward of a target policy.
We propose a novel framework that provides robust and optimistic cumulative reward estimates using one or multiple logged datasets.
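For context, a minimal off-policy evaluation sketch using standard importance sampling with a bootstrap interval; this is generic OPE machinery, not the paper's estimator, and the logged returns and ratios are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
logged_returns = rng.normal(1.0, 1.0, size=500)
ratios = np.exp(rng.normal(0.0, 0.3, size=500))  # hypothetical pi_target/pi_log

estimates = ratios * logged_returns
boot = [np.mean(rng.choice(estimates, size=len(estimates))) for _ in range(1000)]
print("IS estimate:", estimates.mean())
print("95% bootstrap CI:", np.quantile(boot, [0.025, 0.975]))
```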
arXiv Detail & Related papers (2020-11-08T23:16:19Z)
- Doubly Robust Off-Policy Value and Gradient Estimation for Deterministic Policies [80.42316902296832]
We study the estimation of policy value and gradient of a deterministic policy from off-policy data when actions are continuous.
In this setting, standard importance sampling and doubly robust estimators for policy value and gradient fail because the density ratio does not exist.
We propose several new doubly robust estimators based on different kernelization approaches.
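A sketch of the kernelization idea, assuming one-dimensional actions and a known Gaussian logging density (the paper's doubly robust estimators add a model-based correction term on top): replace the nonexistent density ratio with a kernel around the deterministic target action.

```python
import numpy as np

def kernel_weights(logged_actions, target_actions, logging_density, h=0.1):
    """Gaussian-kernel surrogate for the density ratio, which does not
    exist when the target policy is deterministic and actions continuous."""
    k = np.exp(-((logged_actions - target_actions) ** 2) / (2 * h ** 2))
    k /= h * np.sqrt(2 * np.pi)
    return k / logging_density

rng = np.random.default_rng(0)
a_logged = rng.normal(0.0, 1.0, size=1000)            # logging policy: N(0, 1)
log_density = np.exp(-a_logged ** 2 / 2) / np.sqrt(2 * np.pi)
a_target = np.zeros(1000)                             # deterministic pi(s) = 0
rewards = 1.0 - a_logged ** 2                         # hypothetical reward
w = kernel_weights(a_logged, a_target, log_density)
print("estimated value of pi:", np.mean(w * rewards))  # ~ reward at a = 0
```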
arXiv Detail & Related papers (2020-06-06T15:52:05Z)