Fair Set Selection: Meritocracy and Social Welfare
- URL: http://arxiv.org/abs/2102.11932v1
- Date: Tue, 23 Feb 2021 20:36:36 GMT
- Title: Fair Set Selection: Meritocracy and Social Welfare
- Authors: Thomas Kleine Buening and Meirav Segal and Debabrota Basu and Christos
Dimitrakakis
- Abstract summary: We formulate the problem of selecting a set of individuals from a candidate population as a utility maximisation problem.
From the decision maker's perspective, it is equivalent to finding a selection policy that maximises expected utility.
Our framework leads to the notion of expected marginal contribution (EMC) of an individual with respect to a selection policy as a measure of deviation from meritocracy.
- Score: 6.205308371824033
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we formulate the problem of selecting a set of individuals
from a candidate population as a utility maximisation problem. From the
decision maker's perspective, it is equivalent to finding a selection policy
that maximises expected utility. Our framework leads to the notion of expected
marginal contribution (EMC) of an individual with respect to a selection policy
as a measure of deviation from meritocracy. In order to solve the maximisation
problem, we propose to use a policy gradient algorithm. For certain policy
structures, the policy gradients are proportional to EMCs of individuals.
Consequently, the policy gradient algorithm leads to a locally optimal solution
that has zero EMC, and satisfies meritocracy. For uniform policies, EMC reduces
to the Shapley value. EMC also generalises the fair selection properties of
Shapley value for general selection policies. We experimentally analyse the
effect of different policy structures in a simulated college admission setting
and compare with ranking and greedy algorithms. Our results verify that
separable linear policies achieve high utility while minimising EMCs. We also
show that we can design utility functions that successfully promote notions of
group fairness, such as diversity.
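The abstract leaves the formal definitions to the paper, so the following is only a minimal Python sketch under stated assumptions: a "separable" selection policy is read as including each candidate independently with probability sigmoid(w·x_i), EMC is read as the expected utility gap between selections with and without the individual, and the utility function, features, and sample sizes are hypothetical placeholders rather than the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)


def utility(selected, scores):
    """Decision maker's utility of a selected set.

    Hypothetical stand-in: sum of the selected individuals' scores. The paper
    allows general utility functions, e.g. ones that also reward diversity.
    """
    return scores[selected].sum()


def sample_set(probs):
    """Sample a set from a separable policy: each candidate is included
    independently with its own probability (an assumed parameterisation)."""
    return rng.random(len(probs)) < probs


def emc(i, probs, scores, n_samples=2000):
    """Monte Carlo estimate of the expected marginal contribution (EMC) of
    candidate i, read here as E_{S ~ pi}[ u(S with i) - u(S without i) ]."""
    total = 0.0
    for _ in range(n_samples):
        s = sample_set(probs)
        with_i, without_i = s.copy(), s.copy()
        with_i[i], without_i[i] = True, False
        total += utility(with_i, scores) - utility(without_i, scores)
    return total / n_samples


# Toy candidate pool: 8 candidates with 3 features each (all made up).
features = rng.normal(size=(8, 3))
scores = features @ np.array([1.0, 0.5, -0.2])

# Separable linear policy: inclusion probability sigmoid(w . x_i).
w = np.zeros(3)
probs = 1.0 / (1.0 + np.exp(-(features @ w)))

print("EMC estimates:", [round(emc(i, probs, scores), 3) for i in range(8)])

# One REINFORCE-style gradient step on the expected utility of the policy.
# For independent Bernoulli inclusion, grad_w log pi(S) = sum_i (1[i in S] - p_i) x_i.
grad, n = np.zeros_like(w), 500
for _ in range(n):
    s = sample_set(probs)
    grad += utility(s, scores) * ((s.astype(float) - probs) @ features)
w += 0.01 * grad / n
```

Under this reading, the REINFORCE-style step nudges the policy towards higher expected utility; the abstract's claim is that, for suitable policy structures, such gradients are proportional to individuals' EMCs, so a local optimum has zero EMC and, in that sense, satisfies meritocracy.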
Related papers
- Policy Aggregation [21.21314301021803]
We consider the challenge of AI value alignment with multiple individuals with different reward functions and optimal policies in an underlying Markov decision process.
We formalize this problem as one of policy aggregation, where the goal is to identify a desirable collective policy.
The key insight is that social choice methods can be reinterpreted by identifying ordinal preferences with volumes of subsets of the state-action occupancy polytope.
arXiv Detail & Related papers (2024-11-06T04:19:50Z)
- Personalized Reinforcement Learning with a Budget of Policies [9.846353643883443]
Personalization in machine learning (ML) tailors models' decisions to the individual characteristics of users.
We propose a novel framework termed represented Markov Decision Processes (r-MDPs) that is designed to balance the need for personalization with regulatory constraints.
In an r-MDP, we cater to a diverse user population, each with unique preferences, through interaction with a small set of representative policies.
We develop two deep reinforcement learning algorithms that efficiently solve r-MDPs.
arXiv Detail & Related papers (2024-01-12T11:27:55Z)
- Policy Dispersion in Non-Markovian Environment [53.05904889617441]
This paper aims to learn diverse policies from the history of state-action pairs in a non-Markovian environment.
We first adopt a transformer-based method to learn policy embeddings.
Then, we stack the policy embeddings to construct a dispersion matrix to induce a set of diverse policies.
arXiv Detail & Related papers (2023-02-28T11:58:39Z)
- Policy learning "without" overlap: Pessimism and generalized empirical Bernstein's inequality [94.89246810243053]
This paper studies offline policy learning, which aims at utilizing observations collected a priori to learn an optimal individualized decision rule.
Existing policy learning methods rely on a uniform overlap assumption, i.e., the propensities of exploring all actions for all individual characteristics must be lower bounded.
We propose Pessimistic Policy Learning (PPL), a new algorithm that optimizes lower confidence bounds (LCBs) instead of point estimates.
arXiv Detail & Related papers (2022-12-19T22:43:08Z)
- Efficient Policy Iteration for Robust Markov Decision Processes via Regularization [49.05403412954533]
Robust Markov decision processes (MDPs) provide a framework to model decision problems where the system dynamics are changing or only partially known.
Recent work established the equivalence between $s$-rectangular $L_p$ robust MDPs and regularized MDPs, and derived a regularized policy iteration scheme that enjoys the same level of efficiency as standard MDPs.
In this work, we focus on the policy improvement step and derive concrete forms for the greedy policy and the optimal robust Bellman operators.
arXiv Detail & Related papers (2022-05-28T04:05:20Z)
- CAMEO: Curiosity Augmented Metropolis for Exploratory Optimal Policies [62.39667564455059]
We consider and study a distribution of optimal policies.
In experimental simulations we show that CAMEO indeed obtains policies that all solve classic control problems.
We further show that the different policies we sample present different risk profiles, corresponding to interesting practical applications in interpretability.
arXiv Detail & Related papers (2022-05-19T09:48:56Z) - Off-Policy Evaluation with Policy-Dependent Optimization Response [90.28758112893054]
We develop a new framework for off-policy evaluation with a textitpolicy-dependent linear optimization response.
We construct unbiased estimators for the policy-dependent estimand by a perturbation method.
We provide a general algorithm for optimizing causal interventions.
arXiv Detail & Related papers (2022-02-25T20:25:37Z) - Safe Policy Learning through Extrapolation: Application to Pre-trial
Risk Assessment [0.0]
We develop a robust optimization approach that partially identifies the expected utility of a policy, and then finds an optimal policy.
We extend this approach to common and important settings where humans make decisions with the aid of algorithmic recommendations.
We derive new classification and recommendation rules that retain the transparency and interpretability of the existing risk assessment instrument.
arXiv Detail & Related papers (2021-09-22T00:52:03Z) - Offline Policy Selection under Uncertainty [113.57441913299868]
We consider offline policy selection as learning preferences over a set of policy prospects given a fixed experience dataset.
Access to the full distribution over one's belief of the policy value enables more flexible selection algorithms under a wider range of downstream evaluation metrics.
We show how BayesDICE may be used to rank policies with respect to any arbitrary downstream policy selection metric.
arXiv Detail & Related papers (2020-12-12T23:09:21Z) - Robust Batch Policy Learning in Markov Decision Processes [0.0]
We study the offline data-driven sequential decision-making problem in the framework of Markov decision processes (MDPs).
We propose to evaluate each policy by a set of average rewards with respect to distributions centered at the policy-induced stationary distribution.
arXiv Detail & Related papers (2020-11-09T04:41:21Z) - Optimal Policies for the Homogeneous Selective Labels Problem [19.54948759840131]
This paper reports work in progress on learning decision policies in the face of selective labels.
For maximizing discounted total reward, the optimal policy is shown to be a threshold policy.
For undiscounted infinite-horizon average reward, optimal policies have positive acceptance probability in all states.
arXiv Detail & Related papers (2020-11-02T23:32:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.