Related papers: Some Optimizers are More Equal: Understanding the Role of Optimizers in Group Fairness

URL: http://arxiv.org/abs/2504.14882v1
Date: Mon, 21 Apr 2025 06:20:50 GMT
Title: Some Optimizers are More Equal: Understanding the Role of Optimizers in Group Fairness
Authors: Mojtaba Kolahdouzi, Hatice Gunes, Ali Etemad
Abstract summary: We study whether and how the choice of optimization algorithm can impact group fairness in deep neural networks. We show that the choice of optimization algorithm indeed influences fairness outcomes, particularly under severe imbalance. Our results highlight the role of adaptive updates as a crucial mechanism for promoting fair outcomes.
Score: 26.49261268883266
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: We study whether and how the choice of optimization algorithm can impact group fairness in deep neural networks. Through stochastic differential equation analysis of optimization dynamics in an analytically tractable setup, we demonstrate that the choice of optimization algorithm indeed influences fairness outcomes, particularly under severe imbalance. Furthermore, we show that when comparing two categories of optimizers, adaptive methods and stochastic methods, RMSProp (from the adaptive category) has a higher likelihood of converging to fairer minima than SGD (from the stochastic category). Building on this insight, we derive two new theoretical guarantees showing that, under appropriate conditions, RMSProp exhibits fairer parameter updates and improved fairness in a single optimization step compared to SGD. We then validate these findings through extensive experiments on three publicly available datasets, namely CelebA, FairFace, and MS-COCO, across tasks such as facial expression recognition, gender classification, and multi-label classification, using various backbones. Considering multiple fairness definitions, including equalized odds, equal opportunity, and demographic parity, adaptive optimizers like RMSProp and Adam consistently outperform SGD in terms of group fairness while maintaining comparable predictive accuracy. Our results highlight the role of adaptive updates as a crucial yet overlooked mechanism for promoting fair outcomes.
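The three fairness definitions named in the abstract can be made concrete with a small sketch. The Python below is not taken from the paper: it assumes binary labels, binary predictions, and a binary sensitive attribute, and the names `positive_rate` and `fairness_gaps` are hypothetical. Each criterion is reported as an absolute gap between the two groups, with equalized odds taken as the larger of the true-positive-rate and false-positive-rate gaps. That is one common convention and may differ from the paper's exact formulation.

```python
from typing import Dict, Sequence


def positive_rate(y_pred: Sequence[int], keep: Sequence[bool]) -> float:
    """Share of positive predictions among the samples where keep is True."""
    selected = [p for p, k in zip(y_pred, keep) if k]
    return sum(selected) / len(selected) if selected else 0.0


def fairness_gaps(
    y_true: Sequence[int], y_pred: Sequence[int], group: Sequence[int]
) -> Dict[str, float]:
    """Absolute between-group gaps for a binary sensitive attribute (0 or 1)."""

    def rates(g: int):
        in_g = [a == g for a in group]
        pos = [m and t == 1 for m, t in zip(in_g, y_true)]  # group g, label 1
        neg = [m and t == 0 for m, t in zip(in_g, y_true)]  # group g, label 0
        return (
            positive_rate(y_pred, in_g),  # P(pred = 1 | group = g)
            positive_rate(y_pred, pos),   # true positive rate in group g
            positive_rate(y_pred, neg),   # false positive rate in group g
        )

    ppr0, tpr0, fpr0 = rates(0)
    ppr1, tpr1, fpr1 = rates(1)
    tpr_gap = abs(tpr0 - tpr1)
    fpr_gap = abs(fpr0 - fpr1)
    return {
        "demographic_parity": abs(ppr0 - ppr1),
        "equal_opportunity": tpr_gap,
        # Assumption: equalized odds summarized as the worse of the two gaps.
        "equalized_odds": max(tpr_gap, fpr_gap),
    }


# Toy data, invented purely for illustration.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]

gaps = fairness_gaps(y_true, y_pred, group)
print(gaps)
# {'demographic_parity': 0.5, 'equal_opportunity': 0.5, 'equalized_odds': 0.5}
```

A gap of 0 means the two groups are treated identically under that criterion. Comparing optimizers in this sense would amount to training the same model with each one and comparing these gaps at similar accuracy.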

Related papers

ASPO: Adaptive Sentence-Level Preference Optimization for Fine-Grained Multimodal Reasoning [14.034412856423529]
Direct Preference Optimization (DPO) has gained attention for its simplicity and computational efficiency in aligning large language models (LLMs). Recent advancements have extended DPO to multimodal scenarios, achieving strong performance. Traditional DPO relies on binary preference optimization, rewarding or penalizing entire responses without considering fine-grained segment correctness. We propose Adaptive Sentence-level Preference Optimization (ASPO), which evaluates individual sentences for more precise preference optimization.
arXiv Detail & Related papers (2025-05-25T11:33:08Z)
Review, Refine, Repeat: Understanding Iterative Decoding of AI Agents with Dynamic Evaluation and Selection [71.92083784393418]
Inference-time methods such as Best-of-N (BON) sampling offer a simple yet effective alternative to improve performance. We propose Iterative Agent Decoding (IAD), which combines iterative refinement with dynamic candidate evaluation and selection guided by a verifier.
arXiv Detail & Related papers (2025-04-02T17:40:47Z)
Self-supervised Preference Optimization: Enhance Your Language Model with Preference Degree Awareness [27.43137305486112]
We propose a novel Self-supervised Preference Optimization (SPO) framework, which constructs a self-supervised preference degree loss combined with the alignment loss. The results demonstrate that SPO can be seamlessly integrated with existing preference optimization methods to achieve state-of-the-art performance.
arXiv Detail & Related papers (2024-09-26T12:37:26Z)
Adaptive Preference Scaling for Reinforcement Learning with Human Feedback [103.36048042664768]
Reinforcement learning from human feedback (RLHF) is a prevalent approach to align AI systems with human values. We propose a novel adaptive preference loss, underpinned by distributionally robust optimization (DRO). Our method is versatile and can be readily adapted to various preference optimization frameworks.
arXiv Detail & Related papers (2024-06-04T20:33:22Z)
Evolutionary Multi-Objective Optimisation for Fairness-Aware Self Adjusting Memory Classifiers in Data Streams [2.8366371519410887]
We introduce a novel approach to enhance fairness in machine learning algorithms applied to data stream classification. The proposed approach integrates the strengths of the self-adjusting memory K-Nearest-Neighbour algorithm with evolutionary multi-objective optimisation. We show that the proposed approach maintains competitive accuracy and significantly reduces discrimination.
arXiv Detail & Related papers (2024-04-18T10:59:04Z)
Beyond Single-Model Views for Deep Learning: Optimization versus Generalizability of Stochastic Optimization Algorithms [13.134564730161983]
This paper adopts a novel approach to deep learning optimization, focusing on stochastic gradient descent (SGD) and its variants. We show that SGD and its variants demonstrate performance on par with flat-minima optimizers like SAM, albeit with half the gradient evaluations. Our study uncovers several key findings regarding the relationship between training loss and hold-out accuracy, as well as the comparable performance of SGD and noise-enabled variants.
arXiv Detail & Related papers (2024-03-01T14:55:22Z)
MADA: Meta-Adaptive Optimizers through hyper-gradient Descent [73.1383658672682]
We introduce Meta-Adaptive Optimizers (MADA), a unified framework that can generalize several known optimizers and dynamically learn the most suitable one during training. We empirically compare MADA to other popular optimizers on vision and language tasks, and find that MADA consistently outperforms Adam and other popular optimizers. We also propose AVGrad, a modification of AMSGrad that replaces the maximum operator with averaging, which is more suitable for hyper-gradient optimization.
arXiv Detail & Related papers (2024-01-17T00:16:46Z)
Bidirectional Looking with A Novel Double Exponential Moving Average to Adaptive and Non-adaptive Momentum Optimizers [109.52244418498974]
We propose a novel Admeta (A Double exponential Moving averagE To Adaptive and non-adaptive momentum) framework. We provide two implementations, AdmetaR and AdmetaS, the former based on RAdam and the latter based on SGDM.
arXiv Detail & Related papers (2023-07-02T18:16:06Z)
Chasing Fairness Under Distribution Shift: A Model Weight Perturbation Approach [72.19525160912943]
We first theoretically demonstrate the inherent connection between distribution shift, data perturbation, and model weight perturbation. We then analyze the sufficient conditions to guarantee fairness for the target dataset. Motivated by these sufficient conditions, we propose robust fairness regularization (RFR).
arXiv Detail & Related papers (2023-03-06T17:19:23Z)
Empirical Study on Optimizer Selection for Out-of-Distribution Generalization [16.386766049451317]
Modern deep learning systems do not generalize well when the test data distribution is slightly different from the training data distribution. In this study, we examine the performance of popular first-order optimizers for different classes of distributional shift.
arXiv Detail & Related papers (2022-11-15T23:56:30Z)
Fairness via Adversarial Attribute Neighbourhood Robust Learning [49.93775302674591]
We propose a principled Robust Adversarial Attribute Neighbourhood (RAAN) loss to debias the classification head.
arXiv Detail & Related papers (2022-10-12T23:39:28Z)
Fair and Green Hyperparameter Optimization via Multi-objective and Multiple Information Source Bayesian Optimization [0.19116784879310028]
FanG-HPO uses subsets of the large dataset (aka information sources) to obtain cheap approximations of both accuracy and fairness. Experiments consider two benchmark (fairness) datasets and two machine learning algorithms.
arXiv Detail & Related papers (2022-05-18T10:07:21Z)
Adaptive Optimizers with Sparse Group Lasso for Neural Networks in CTR Prediction [19.08180531016811]
We develop a novel framework that adds regularizers of the sparse group lasso to a family of adaptive optimizers in deep learning. We establish proven convergence guarantees in convex settings. Our methods can achieve extremely high sparsity with significantly better or highly competitive performance.
arXiv Detail & Related papers (2021-07-30T05:33:43Z)
Towards Model-Agnostic Post-Hoc Adjustment for Balancing Ranking Fairness and Algorithm Utility [54.179859639868646]
Bipartite ranking aims to learn a scoring function that ranks positive individuals higher than negative ones from labeled data. There have been rising concerns about whether the learned scoring function can cause systematic disparity across different protected groups. We propose a model post-processing framework for balancing ranking fairness and algorithm utility in the bipartite ranking scenario.
arXiv Detail & Related papers (2020-06-15T10:08:39Z)

This list is automatically generated from the titles and abstracts of the papers on this site.