fairml: A Statistician's Take on Fair Machine Learning Modelling
- URL: http://arxiv.org/abs/2305.02009v1
- Date: Wed, 3 May 2023 09:59:53 GMT
- Title: fairml: A Statistician's Take on Fair Machine Learning Modelling
- Authors: Marco Scutari
- Abstract summary: We describe the fairml R package, which implements our previous work (Scutari, Panero, and Proissl 2022) and related models from the literature.
fairml is designed around classical statistical models and penalised regression results.
The constraint used to enforce fairness is orthogonal to model estimation, making it possible to mix-and-match the desired model family and fairness definition for each application.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The adoption of machine learning in applications where it is crucial to
ensure fairness and accountability has led to a large number of model proposals
in the literature, largely formulated as optimisation problems with constraints
reducing or eliminating the effect of sensitive attributes on the response.
While this approach is very flexible from a theoretical perspective, the
resulting models are somewhat black-box in nature: very little can be said
about their statistical properties, about best practices in their applied
use, or about how they can be extended to problems other than those they
were originally designed for. Furthermore, the estimation of each model
requires a bespoke implementation involving an appropriate solver, which is
less than desirable from a software engineering perspective.
In this paper, we describe the fairml R package which implements our previous
work (Scutari, Panero, and Proissl 2022) and related models in the literature.
fairml is designed around classical statistical models (generalised linear
models) and penalised regression results (ridge regression) to produce fair
models that are interpretable and whose properties are well-known. The
constraint used to enforce fairness is orthogonal to model estimation, making
it possible to mix-and-match the desired model family and fairness definition
for each application. Furthermore, fairml provides facilities for model
estimation, model selection and validation including diagnostic plots.
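As an illustration of this design, here is a minimal sketch of the intended workflow, assuming the frrm() fair ridge regression interface from the package documentation; the simulated data and the 0.05 unfairness budget are illustrative choices, not taken from the paper.

```r
## Minimal sketch (not from the paper): fit a fair ridge regression with
## frrm(), bounding the share of variance the sensitive attribute may
## explain. Data are simulated purely for illustration.
library(fairml)

set.seed(42)
n  <- 500
s  <- factor(sample(c("A", "B"), n, replace = TRUE))  # sensitive attribute
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - x2 + 0.5 * (s == "B") + rnorm(n)

# frrm() takes response, predictors and sensitive attributes separately,
# plus an unfairness budget in [0, 1]; 0 would enforce complete fairness.
model <- frrm(response = y,
              predictors = data.frame(x1 = x1, x2 = x2),
              sensitive  = data.frame(s = s),
              unfairness = 0.05)
summary(model)
```

Since the fairness constraint is orthogonal to estimation, the same call pattern should carry over to the package's generalised-linear-model variant, fgrrm(), in line with the mix-and-match design described above.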
Related papers
- Revisiting SMoE Language Models by Evaluating Inefficiencies with Task Specific Expert Pruning [78.72226641279863]
Sparse Mixture of Experts (SMoE) models have emerged as a scalable alternative to dense models in language modeling.
Our research explores task-specific model pruning to inform decisions about designing SMoE architectures.
We introduce an adaptive task-aware pruning technique UNCURL to reduce the number of experts per MoE layer in an offline manner post-training.
arXiv Detail & Related papers (2024-09-02T22:35:03Z) - Fair Multivariate Adaptive Regression Splines for Ensuring Equity and
Transparency [1.124958340749622]
We propose a fair predictive model based on MARS that incorporates fairness measures in the learning process.
MARS is a non-parametric regression model that performs feature selection, handles non-linear relationships, generates interpretable decision rules, and derives optimal splitting criteria on the variables.
We apply our fairMARS model to real-world data and demonstrate its effectiveness in terms of accuracy and equity.
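For context on the base model only (the paper's fairness constraint is not shown here), MARS fits are available in R through the earth package; a minimal sketch on a built-in dataset:

```r
## Plain, unconstrained MARS via the earth package, shown only to
## illustrate the base model; the paper's fairness measures are not
## part of this sketch.
library(earth)

fit <- earth(Volume ~ Girth + Height, data = trees)  # built-in dataset
summary(fit)  # selected hinge-function basis and coefficients
```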
arXiv Detail & Related papers (2024-02-23T19:02:24Z) - Non-Invasive Fairness in Learning through the Lens of Data Drift [88.37640805363317]
We show how to improve the fairness of Machine Learning models without altering the data or the learning algorithm.
We use a simple but key insight: the divergence of trends between different populations, and, consequently, between a learned model and minority populations, is analogous to data drift.
We explore two strategies (model-splitting and reweighing) to resolve this drift, aiming to improve the overall conformance of models to the underlying data.
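As a concrete instance of the second strategy, one standard reweighing scheme (in the style of Kamiran and Calders; the paper's exact variant may differ) weights each (group, label) cell by P(s)P(y)/P(s, y), so that the sensitive attribute and the label look independent in the reweighted data. A minimal sketch in R:

```r
## One standard reweighing scheme, shown for intuition only: weight each
## (group, label) cell by P(s) * P(y) / P(s, y) so that s and y look
## independent after weighting. Assumes every (s, y) cell is non-empty.
reweigh <- function(s, y) {
  ps  <- table(s) / length(s)                  # marginal P(s)
  py  <- table(y) / length(y)                  # marginal P(y)
  psy <- table(s, y) / length(s)               # joint P(s, y)
  w   <- outer(ps, py) / psy                   # weight per (s, y) cell
  w[cbind(as.character(s), as.character(y))]   # one weight per observation
}

s <- factor(c("A", "A", "B", "B", "B"))
y <- factor(c(1, 0, 1, 1, 0))
reweigh(s, y)  # these weights can feed any learner's 'weights' argument
```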
arXiv Detail & Related papers (2023-03-30T17:30:42Z) - Minimal Value-Equivalent Partial Models for Scalable and Robust Planning
in Lifelong Reinforcement Learning [56.50123642237106]
Common practice in model-based reinforcement learning is to learn models that capture every aspect of the agent's environment.
We argue that such models are not particularly well-suited for performing scalable and robust planning in lifelong reinforcement learning scenarios.
We propose new kinds of models that only model the relevant aspects of the environment, which we call "minimal value-equivalent partial models".
arXiv Detail & Related papers (2023-01-24T16:40:01Z) - Investigating Ensemble Methods for Model Robustness Improvement of Text
Classifiers [66.36045164286854]
We analyze a set of existing bias features and demonstrate that no single model works best in all cases.
By choosing an appropriate bias model, we can obtain better robustness than baselines with more sophisticated model designs.
arXiv Detail & Related papers (2022-10-28T17:52:10Z) - A non-asymptotic penalization criterion for model selection in mixture
of experts models [1.491109220586182]
We consider the Gaussian-gated localized MoE (GLoME) regression model for modeling heterogeneous data.
This model poses challenging questions with respect to the statistical estimation and model selection problems.
We study the problem of estimating the number of components of the GLoME model, in a penalized maximum likelihood estimation framework.
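For orientation, a generic penalised maximum likelihood rule for selecting the number of components K has the following shape; this is a textbook-style illustration of the framework, not the paper's exact criterion.

```latex
\hat{K} = \operatorname*{arg\,min}_{K}
          \left\{ -\log L\bigl(\hat{\theta}_K\bigr) + \operatorname{pen}(K) \right\}
```

Here pen(K) grows with the model dimension; the paper's contribution is a non-asymptotic choice of this penalty for GLoME models.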
arXiv Detail & Related papers (2021-04-06T16:24:55Z) - fairmodels: A Flexible Tool For Bias Detection, Visualization, And
Mitigation [3.548416925804316]
This article introduces the R package fairmodels, which helps validate fairness and eliminate bias in classification models.
The implemented set of functions and fairness metrics enables model fairness validation from different perspectives.
The package includes a series of methods for bias mitigation that aim to diminish the discrimination in the model.
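A minimal sketch of this workflow, assuming the DALEX-based interface from the fairmodels documentation; the logistic regression on the bundled German credit data follows the package's stock example and is used here purely for illustration.

```r
## Minimal sketch of a fairmodels check, assuming its documented
## DALEX-based workflow on the bundled German credit data.
library(DALEX)
library(fairmodels)

data("german", package = "fairmodels")
model <- glm(Risk ~ ., data = german, family = binomial(link = "logit"))

explainer <- DALEX::explain(model,
                            data = german[, -1],            # drop response
                            y    = as.numeric(german$Risk) - 1)

fobject <- fairness_check(explainer,
                          protected  = german$Sex,
                          privileged = "male")
plot(fobject)  # visual comparison of group fairness metrics
```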
arXiv Detail & Related papers (2021-04-01T15:06:13Z) - Characterizing Fairness Over the Set of Good Models Under Selective
Labels [69.64662540443162]
We develop a framework for characterizing predictive fairness properties over the set of models that deliver similar overall performance.
We provide tractable algorithms to compute the range of attainable group-level predictive disparities.
We extend our framework to address the empirically relevant challenge of selectively labelled data.
arXiv Detail & Related papers (2021-01-02T02:11:37Z) - On Statistical Efficiency in Learning [37.08000833961712]
We address the challenge of model selection to strike a balance between model fitting and model complexity.
We propose an online algorithm that sequentially expands the model complexity to enhance selection stability and reduce cost.
Experimental studies show that the proposed method has desirable predictive power and significantly less computational cost than some popular methods.
arXiv Detail & Related papers (2020-12-24T16:08:29Z) - Models, Pixels, and Rewards: Evaluating Design Trade-offs in Visual
Model-Based Reinforcement Learning [109.74041512359476]
We study a number of design decisions for the predictive model in visual MBRL algorithms.
We find that a range of design decisions that are often considered crucial, such as the use of latent spaces, have little effect on task performance.
We show how this phenomenon is related to exploration and how some of the lower-scoring models on standard benchmarks will perform the same as the best-performing models when trained on the same training data.
arXiv Detail & Related papers (2020-12-08T18:03:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.