Adam Simplified: Bias Correction Debunked
- URL: http://arxiv.org/abs/2511.20516v2
- Date: Wed, 26 Nov 2025 10:07:45 GMT
- Title: Adam Simplified: Bias Correction Debunked
- Authors: Sam Laing, Antonio Orvieto
- Abstract summary: This paper investigates the role of bias-correction, a feature whose contribution remains poorly understood. Through a series of systematic ablations on vision and language modelling tasks, we demonstrate that the conventional wisdom surrounding bias correction is misleading.
- Score: 17.2249234816671
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Adam optimizer is a cornerstone of modern deep learning, yet the empirical necessity of each of its individual components is often taken for granted. This paper presents a focused investigation into the role of bias-correction, a feature whose contribution remains poorly understood. Through a series of systematic ablations on vision and language modelling tasks, we demonstrate that the conventional wisdom surrounding bias correction is misleading. In particular, we demonstrate that in the optimal hyper-parameter configuration, the inclusion of bias correction leads to no improvement in final test performance. Moreover, unless appropriate learning rate scheduling is implemented, the inclusion of bias correction can sometimes be detrimental to performance. We further reinterpret bias correction as a form of implicit learning rate scheduling whose behaviour is strongly dependent on the choice of smoothing hyper-parameters $\beta_1, \beta_2 \in [0,1)$. Our findings challenge the universal inclusion of this component.
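The ablation the abstract describes can be sketched as a standard Adam update with bias correction made optional. This is a minimal illustrative implementation, not the authors' code: the function names and the numpy formulation are assumptions, but the update rule and the bias-correction terms follow the well-known Adam definition. The "implicit learning rate scheduling" reinterpretation corresponds to the fact that the two correction factors combine into a single time-dependent multiplier, $\sqrt{1 - \beta_2^t}\,/\,(1 - \beta_1^t)$, on the effective step size.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3,
              beta1=0.9, beta2=0.999, eps=1e-8,
              bias_correction=True):
    """One Adam update; `bias_correction` toggles the ablated feature."""
    m = beta1 * m + (1 - beta1) * grad        # first-moment EMA
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment EMA
    if bias_correction:
        m_hat = m / (1 - beta1 ** t)          # debias first moment
        v_hat = v / (1 - beta2 ** t)          # debias second moment
    else:
        m_hat, v_hat = m, v
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

def implicit_schedule(t, beta1=0.9, beta2=0.999):
    """Multiplier on lr induced by bias correction at step t."""
    return np.sqrt(1 - beta2 ** t) / (1 - beta1 ** t)
```

With the default $\beta_1 = 0.9$, $\beta_2 = 0.999$, `implicit_schedule(1)` is roughly 0.32 and the multiplier rises toward 1 over the first few hundred steps, so bias correction behaves like a short warmup-style schedule on the step size, which is consistent with the paper's framing of it as implicit learning rate scheduling whose shape depends on $\beta_1$ and $\beta_2$.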
Related papers
- Selective Mixup for Debiasing Question Selection in Computerized Adaptive Testing [50.805231979748434]
Computerized Adaptive Testing (CAT) is a widely used technology for evaluating learners' proficiency in online education platforms. Selection bias arises because the question selection is strongly influenced by the estimated proficiency. We propose a debiasing framework consisting of two key modules: Cross-Attribute Examinee Retrieval and Selective Mixup-based Regularization.
arXiv Detail & Related papers (2025-11-19T08:55:01Z) - Optimal Information Combining for Multi-Agent Systems Using Adaptive Bias Learning [0.0]
Current approaches either ignore these biases, leading to suboptimal decisions, or require expensive calibration procedures that are often infeasible in practice. This paper addresses the fundamental question: when can we learn and correct for these unknown biases to recover near-optimal performance? We develop a theoretical framework that decomposes biases into learnable systematic components and irreducible components. We show that systems with high learnability ratios can recover significant performance, while those with low learnability show minimal benefit.
arXiv Detail & Related papers (2025-10-28T21:52:33Z) - Debiasify: Self-Distillation for Unsupervised Bias Mitigation [19.813054813868476]
Simplicity bias poses a significant challenge in neural networks, often leading models to favor simpler solutions and inadvertently learn decision rules influenced by spurious correlations.
We introduce Debiasify, a novel self-distillation approach that requires no prior knowledge about the nature of biases.
Our method leverages a new distillation loss to transfer knowledge within the network, from deeper layers containing complex, highly-predictive features to shallower layers with simpler, attribute-conditioned features in an unsupervised manner.
arXiv Detail & Related papers (2024-11-01T16:25:05Z) - Causality and Independence Enhancement for Biased Node Classification [56.38828085943763]
We propose a novel Causality and Independence Enhancement (CIE) framework, applicable to various graph neural networks (GNNs).
Our approach estimates causal and spurious features at the node representation level and mitigates the influence of spurious correlations.
Our approach CIE not only significantly enhances the performance of GNNs but also outperforms state-of-the-art debiased node classification methods.
arXiv Detail & Related papers (2023-10-14T13:56:24Z) - Class-Imbalanced Graph Learning without Class Rebalancing [62.1368829847041]
Class imbalance is prevalent in real-world node classification tasks and poses great challenges for graph learning models.
In this work, we approach the root cause of class-imbalance bias from a topological paradigm.
We devise a lightweight topological augmentation framework BAT to mitigate the class-imbalance bias without class rebalancing.
arXiv Detail & Related papers (2023-08-27T19:01:29Z) - Improving Adaptive Conformal Prediction Using Self-Supervised Learning [72.2614468437919]
We train an auxiliary model with a self-supervised pretext task on top of an existing predictive model and use the self-supervised error as an additional feature to estimate nonconformity scores.
We empirically demonstrate the benefit of the additional information using both synthetic and real data on the efficiency (width), deficit, and excess of conformal prediction intervals.
arXiv Detail & Related papers (2023-02-23T18:57:14Z) - Delving into Identify-Emphasize Paradigm for Combating Unknown Bias [52.76758938921129]
We propose an effective bias-conflicting scoring method (ECS) to boost the identification accuracy.
We also propose gradient alignment (GA) to balance the contributions of the mined bias-aligned and bias-conflicting samples.
Experiments are conducted on multiple datasets in various settings, demonstrating that the proposed solution can mitigate the impact of unknown biases.
arXiv Detail & Related papers (2023-02-22T14:50:24Z) - Ensembling over Classifiers: a Bias-Variance Perspective [13.006468721874372]
We build upon the extension to the bias-variance decomposition by Pfau (2013) in order to gain crucial insights into the behavior of ensembles of classifiers.
We show that conditional estimates necessarily incur an irreducible error.
Empirically, standard ensembling reduces the bias, leading us to hypothesize that ensembles of classifiers may perform well in part because of this unexpected reduction.
arXiv Detail & Related papers (2022-06-21T17:46:35Z) - Relieving Long-tailed Instance Segmentation via Pairwise Class Balance [85.53585498649252]
Long-tailed instance segmentation is a challenging task due to the extreme imbalance of training samples among classes.
It causes severe biases of the head classes (with majority samples) against the tailed ones.
We propose a novel Pairwise Class Balance (PCB) method, built upon a confusion matrix which is updated during training to accumulate the ongoing prediction preferences.
arXiv Detail & Related papers (2022-01-08T07:48:36Z) - Automating Control of Overestimation Bias for Continuous Reinforcement Learning [65.63607016094305]
We present a data-driven approach for guiding bias correction.
We demonstrate its effectiveness on the Truncated Quantile Critics -- a state-of-the-art continuous control algorithm.
arXiv Detail & Related papers (2021-10-26T09:27:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.