Software Engineering Principles for Fairer Systems: Experiments with GroupCART
- URL: http://arxiv.org/abs/2504.12587v1
- Date: Thu, 17 Apr 2025 02:06:05 GMT
- Title: Software Engineering Principles for Fairer Systems: Experiments with GroupCART
- Authors: Kewen Peng, Hao Zhuo, Yicheng Yang, Tim Menzies
- Abstract summary: GroupCART is a tree-based ensemble that avoids bias during model construction. Our experiments show that GroupCART achieves fairer models without data transformation. Results demonstrate that algorithmic bias in decision tree models can be mitigated through multi-task, fairness-aware learning.
- Score: 9.545063195641882
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Discrimination-aware classification aims to make accurate predictions while satisfying fairness constraints. Traditional decision tree learners typically optimize for information gain in the target attribute alone, which can result in models that unfairly discriminate against protected social groups (e.g., gender, ethnicity). Motivated by these shortcomings, we propose GroupCART, a tree-based ensemble optimizer that avoids bias during model construction by optimizing not only for decreased entropy in the target attribute but also for increased entropy in protected attributes. Our experiments show that GroupCART achieves fairer models without data transformation and with minimal performance degradation. Furthermore, the method supports customizable weighting, offering a smooth and flexible trade-off between predictive performance and fairness based on user requirements. These results demonstrate that algorithmic bias in decision tree models can be mitigated through multi-task, fairness-aware learning. All code and datasets used in this study are available at: https://github.com/anonymous12138/groupCART.
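The abstract describes the core mechanism only at a high level: a candidate split is rewarded for reducing entropy in the target attribute and penalized for reducing entropy in the protected attribute, with a user-chosen weight trading the two objectives. The sketch below illustrates that idea in plain NumPy; the function names, the exact scoring formula, and the default weight are assumptions made here for illustration, not GroupCART's actual implementation (see the linked repository for that).

```python
import numpy as np

def entropy(values):
    """Shannon entropy (bits) of a 1-D array of categorical values."""
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def split_score(y, protected, left_mask, w=0.7):
    """Score a candidate binary split of a node's samples.

    Rewards information gain on the target y (lower child entropy) and
    penalizes information gain on the protected attribute, so splits that
    keep protected groups mixed are preferred.  `w` in [0, 1] is the
    user-chosen weight trading predictive performance against fairness.
    NOTE: this scoring rule is an assumption for illustration only.
    """
    right_mask = ~left_mask
    n, n_left, n_right = len(y), left_mask.sum(), right_mask.sum()
    if n_left == 0 or n_right == 0:
        return -np.inf  # degenerate split, never chosen

    def info_gain(attr):
        child_entropy = (n_left / n) * entropy(attr[left_mask]) \
                      + (n_right / n) * entropy(attr[right_mask])
        return entropy(attr) - child_entropy

    return w * info_gain(y) - (1.0 - w) * info_gain(protected)

# Example: pick the threshold on one numeric feature that maximizes the score.
# x, y, protected are assumed to be aligned 1-D NumPy arrays.
# best_score, best_threshold = max(
#     (split_score(y, protected, x < t, w=0.7), t) for t in np.unique(x)
# )
```

In a full tree learner, a score of this kind would simply replace the usual information-gain criterion when ranking candidate splits at each node; setting w = 1 recovers an ordinary CART-style split, while smaller values of w shift the trade-off toward fairness.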
Related papers
- Project-Probe-Aggregate: Efficient Fine-Tuning for Group Robustness [53.96714099151378]
We propose a three-step approach for parameter-efficient fine-tuning of image-text foundation models. Our method improves its two key components: minority-sample identification and the robust training algorithm. Our theoretical analysis shows that our PPA enhances minority group identification and is Bayes optimal for minimizing the balanced group error.
arXiv Detail & Related papers (2025-03-12T15:46:12Z) - Improving Fairness in Credit Lending Models using Subgroup Threshold Optimization [0.0]
We introduce a new fairness technique called Subgroup Threshold Optimization (STO).
STO works by optimizing the classification thresholds for individual subgroups in order to minimize the overall discrimination score between them.
Our experiments on a real-world credit lending dataset show that STO can reduce gender discrimination by over 90%.
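The digest above gives only the high-level recipe: tune one classification threshold per subgroup so that a discrimination score across subgroups is minimized. A minimal, generic illustration of that recipe is sketched below, using the demographic-parity gap as the discrimination score; the function names, the grid search, and the choice of metric are assumptions made here, not the STO paper's algorithm.

```python
import itertools
import numpy as np

def dp_gap(y_pred, group):
    """Demographic-parity gap: spread of positive-prediction rates across subgroups."""
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return max(rates) - min(rates)

def fit_subgroup_thresholds(scores, group, grid=np.linspace(0.1, 0.9, 9)):
    """Grid-search one decision threshold per subgroup to minimize the gap.

    `scores` are model-predicted probabilities and `group` holds the protected
    attribute value for each sample; both are aligned 1-D NumPy arrays.
    """
    subgroups = np.unique(group)
    best_gap, best_thresholds = np.inf, None
    for combo in itertools.product(grid, repeat=len(subgroups)):
        thresholds = dict(zip(subgroups, combo))
        y_pred = np.array([s >= thresholds[g] for s, g in zip(scores, group)])
        gap = dp_gap(y_pred, group)
        if gap < best_gap:
            best_gap, best_thresholds = gap, thresholds
    return best_thresholds
```

A realistic version would also track accuracy (or another utility metric) alongside the gap, so that fairness is not improved by trivially degrading the predictions.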
arXiv Detail & Related papers (2024-03-15T19:36:56Z) - Learning Fair Ranking Policies via Differentiable Optimization of Ordered Weighted Averages [55.04219793298687]
This paper shows how efficiently-solvable fair ranking models can be integrated into the training loop of Learning to Rank.
In particular, this paper is the first to show how to backpropagate through constrained optimizations of OWA objectives, enabling their use in integrated prediction and decision models.
arXiv Detail & Related papers (2024-02-07T20:53:53Z) - Self-Supervised Dataset Distillation for Transfer Learning [77.4714995131992]
We propose a novel problem of distilling an unlabeled dataset into a set of small synthetic samples for efficient self-supervised learning (SSL).
We first prove that a gradient of synthetic samples with respect to an SSL objective in naive bilevel optimization is biased due to randomness originating from data augmentations or masking.
We empirically validate the effectiveness of our method on various applications involving transfer learning.
arXiv Detail & Related papers (2023-10-10T10:48:52Z) - Mitigating Group Bias in Federated Learning for Heterogeneous Devices [1.181206257787103]
Federated Learning is emerging as a privacy-preserving model training approach in distributed edge applications.
Our work proposes a group-fair FL framework that minimizes group bias while preserving privacy and without resource-utilization overhead.
arXiv Detail & Related papers (2023-09-13T16:53:48Z) - CLIPood: Generalizing CLIP to Out-of-Distributions [73.86353105017076]
Contrastive language-image pre-training (CLIP) models have shown impressive zero-shot ability, but further adaptation of CLIP on downstream tasks undesirably degrades out-of-distribution (OOD) performance.
We propose CLIPood, a fine-tuning method that can adapt CLIP models to OOD situations where both domain shifts and open classes may occur on unseen test data.
Experiments on diverse datasets with different OOD scenarios show that CLIPood consistently outperforms existing generalization techniques.
arXiv Detail & Related papers (2023-02-02T04:27:54Z) - Fed-CBS: A Heterogeneity-Aware Client Sampling Mechanism for Federated
Learning via Class-Imbalance Reduction [76.26710990597498]
We show that the class-imbalance of the grouped data from randomly selected clients can lead to significant performance degradation.
Based on our key observation, we design an efficient client sampling mechanism, i.e., Federated Class-balanced Sampling (Fed-CBS).
In particular, we propose a measure of class-imbalance and then employ homomorphic encryption to derive this measure in a privacy-preserving way.
arXiv Detail & Related papers (2022-09-30T05:42:56Z) - xFAIR: Better Fairness via Model-based Rebalancing of Protected
Attributes [15.525314212209564]
Machine learning software can generate models that inappropriately discriminate against specific protected social groups.
We propose xFAIR, a model-based extrapolation method that is capable of both mitigating bias and explaining its cause.
arXiv Detail & Related papers (2021-10-03T22:10:14Z) - Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when there is only bias in the input distribution, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
arXiv Detail & Related papers (2021-06-14T05:39:09Z) - Characterizing Fairness Over the Set of Good Models Under Selective Labels [69.64662540443162]
We develop a framework for characterizing predictive fairness properties over the set of models that deliver similar overall performance.
We provide tractable algorithms to compute the range of attainable group-level predictive disparities.
We extend our framework to address the empirically relevant challenge of selectively labelled data.
arXiv Detail & Related papers (2021-01-02T02:11:37Z) - Fairness by Explicability and Adversarial SHAP Learning [0.0]
We propose a new definition of fairness that emphasises the role of an external auditor and model explicability.
We develop a framework for mitigating model bias using regularizations constructed from the SHAP values of an adversarial surrogate model.
We demonstrate our approaches using gradient and adaptive boosting on: a synthetic dataset, the UCI Adult (Census) dataset and a real-world credit scoring dataset.
arXiv Detail & Related papers (2020-03-11T14:36:34Z) - Counterfactual fairness: removing direct effects through regularization [0.0]
We propose a new definition of fairness that incorporates causality through the Controlled Direct Effect (CDE).
We develop regularizations to tackle classical fairness measures and present a causal regularization that satisfies our new fairness definition.
Our method was found to mitigate unfairness in the predictions with only small reductions in model performance.
arXiv Detail & Related papers (2020-02-25T10:13:55Z)