Related papers: On the Unreasonable Effectiveness of Last-layer Retraining

URL: http://arxiv.org/abs/2512.01766v1
Date: Mon, 01 Dec 2025 15:08:43 GMT
Title: On the Unreasonable Effectiveness of Last-layer Retraining
Authors: John C. Hill, Tyler LaBonte, Xinchen Zhang, Vidya Muthukumar
Abstract summary: Last-layer retraining (LLR) methods have garnered interest as an efficient approach to rectify dependence on spurious correlations. LLR has been found to improve worst-group accuracy even when the held-out set is an imbalanced subset of the training set. We show how the recent algorithms CB-LLR and AFR perform implicit group-balancing to elicit a robustness improvement.
Score: 11.989603982988344
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Last-layer retraining (LLR) methods -- wherein the last layer of a neural network is reinitialized and retrained on a held-out set following ERM training -- have garnered interest as an efficient approach to rectify dependence on spurious correlations and improve performance on minority groups. Surprisingly, LLR has been found to improve worst-group accuracy even when the held-out set is an imbalanced subset of the training set. We initially hypothesize that this "unreasonable effectiveness" of LLR is explained by its ability to mitigate neural collapse through the held-out set, resulting in the implicit bias of gradient descent benefiting robustness. Our empirical investigation does not support this hypothesis. Instead, we present strong evidence for an alternative hypothesis: that the success of LLR is primarily due to better group balance in the held-out set. We conclude by showing how the recent algorithms CB-LLR and AFR perform implicit group-balancing to elicit a robustness improvement.
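
The procedure the abstract describes (freeze the ERM-trained feature extractor, then fit a fresh last layer on a held-out set) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes features have already been extracted, uses a binary logistic-regression head trained with plain gradient descent in NumPy, and assumes group labels are available for balancing. All function names and hyperparameters here are hypothetical.

```python
import numpy as np


def group_balanced_indices(groups, rng):
    """Subsample indices so every group appears as often as the smallest group."""
    groups = np.asarray(groups)
    ids, counts = np.unique(groups, return_counts=True)
    n = counts.min()
    picked = [
        rng.choice(np.flatnonzero(groups == g), size=n, replace=False)
        for g in ids
    ]
    return np.sort(np.concatenate(picked))


def retrain_last_layer(features, labels, lr=0.1, steps=500, l2=1e-3):
    """Fit a freshly initialized binary logistic head on frozen features."""
    X = np.asarray(features, dtype=float)
    y = np.asarray(labels, dtype=float)
    w = np.zeros(X.shape[1])  # reinitialized weights (assumption: zero init)
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid predictions
        w -= lr * (X.T @ (p - y) / len(y) + l2 * w)
        b -= lr * np.mean(p - y)
    return w, b


def worst_group_accuracy(w, b, features, labels, groups):
    """Minimum per-group accuracy of the retrained head."""
    X = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    groups = np.asarray(groups)
    preds = (X @ w + b > 0).astype(int)
    return min(
        np.mean(preds[groups == g] == labels[groups == g])
        for g in np.unique(groups)
    )
```

A typical use would call `group_balanced_indices` on the held-out set's group labels, pass the selected rows to `retrain_last_layer`, and compare `worst_group_accuracy` against a head retrained on the unbalanced held-out set. The explicit balancing step is only one way to obtain better group balance; the abstract states that CB-LLR and AFR achieve it implicitly.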

Related papers

From Absolute to Relative: Rethinking Reward Shaping in Group-Based Reinforcement Learning [7.6602542594279335]
We propose Reinforcement Learning with Relative Rewards to shift reward shaping from absolute scoring to relative ranking. We show that RLRR yields consistent performance improvements over standard group-based baselines across reasoning benchmarks and open-ended generation tasks.
arXiv Detail & Related papers (2026-01-30T15:07:06Z)
A Step Back: Prefix Importance Ratio Stabilizes Policy Optimization [58.116300485427764]
Reinforcement learning post-training can elicit reasoning behaviors in large language models. Token-level correction often leads to unstable training dynamics when the degree of off-policyness is large. We propose a simple yet effective objective, Minimum Prefix Ratio (MinPRO).
arXiv Detail & Related papers (2026-01-30T08:47:19Z)
Identifying and Transferring Reasoning-Critical Neurons: Improving LLM Inference Reliability via Activation Steering [50.63386303357225]
We propose AdaRAS, a lightweight test-time framework that improves reasoning reliability by selectively intervening on neuron activations. AdaRAS identifies Reasoning-Critical Neurons (RCNs) via a polarity-aware mean-difference criterion and adaptively steers their activations during inference. Experiments on 10 mathematics and coding benchmarks demonstrate consistent improvements, including over 13% gains on AIME-24 and AIME-25.
arXiv Detail & Related papers (2026-01-27T17:53:01Z)
Your Group-Relative Advantage Is Biased [74.57406620907797]
Group-based learning methods rely on group-relative advantage estimation to avoid learned critics. In this work, we uncover a fundamental issue of group-based RL: the group-relative advantage estimator is inherently biased relative to the true (expected) advantage. We propose History-Aware Adaptive Difficulty Weighting (HA-DW), an adaptive reweighting scheme that adjusts advantage estimates based on an evolving difficulty anchor and training dynamics.
arXiv Detail & Related papers (2026-01-13T13:03:15Z)
CurES: From Gradient Analysis to Efficient Curriculum Learning for Reasoning LLMs [53.749193998004166]
Curriculum learning plays a crucial role in enhancing the training efficiency of large language models. We propose CurES, an efficient training method that accelerates convergence and employs Bayesian posterior estimation to minimize computational overhead.
arXiv Detail & Related papers (2025-10-01T15:41:27Z)
On the Effect of Negative Gradient in Group Relative Deep Reinforcement Optimization [52.76330545825083]
Reinforcement learning (RL) has become popular in enhancing the reasoning capabilities of large language models (LLMs). We identify a previously unrecognized phenomenon we term Lazy Likelihood Displacement (LLD), wherein the likelihood of correct responses marginally increases or even decreases during training. We develop a method called NTHR, which downweights penalties on tokens contributing to the LLD. Unlike prior DPO-based approaches, NTHR takes advantage of GRPO's group-based structure, using correct responses as anchors to identify influential tokens.
arXiv Detail & Related papers (2025-05-24T18:58:51Z)
Bridging Supervised Learning and Reinforcement Learning in Math Reasoning [55.889740979706815]
Reinforcement Learning (RL) has played a central role in the recent surge of math abilities by enabling self-improvement through binary verifier signals. In this work, we propose Negative-aware Fine-Tuning (NFT) -- a supervised approach that enables LLMs to reflect on their failures and improve autonomously with no external teachers.
arXiv Detail & Related papers (2025-05-23T17:17:40Z)
A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce [68.99924691391048]
We revisit GRPO from a reinforce-like algorithm perspective and analyze its core components. We find that a simple rejection sampling baseline, RAFT, yields performance competitive with GRPO and PPO. Motivated by this insight, we propose Reinforce-Rej, a minimal extension of policy gradient that filters both entirely incorrect and entirely correct samples.
arXiv Detail & Related papers (2025-04-15T16:15:02Z)
Adaptive Group Policy Optimization: Towards Stable Training and Token-Efficient Reasoning [6.417080496525403]
We propose Adaptive Group Policy Optimization (AGPO) which uses a simple but effective method, an adaptive loss function, to mitigate training fluctuation and token inefficiency. Experiments demonstrate our method achieves more stable training and superior performance with significantly fewer tokens in reasoning steps.
arXiv Detail & Related papers (2025-03-20T08:48:57Z)
Annotation-Free Group Robustness via Loss-Based Resampling [3.355491272942994]
Training neural networks for image classification with empirical risk minimization makes them vulnerable to relying on spurious attributes instead of causal ones for prediction. We propose a new method, called loss-based feature re-weighting (LFR), in which we infer a grouping of the data by evaluating an ERM-pre-trained model on a small left-out split of the training data. For a complete assessment, we evaluate LFR on various versions of Waterbirds and CelebA datasets with different spurious correlations.
arXiv Detail & Related papers (2023-12-08T08:22:02Z)
Unbiased and Efficient Self-Supervised Incremental Contrastive Learning [31.763904668737304]
We propose a self-supervised Incremental Contrastive Learning (ICL) framework consisting of a novel Incremental InfoNCE (NCE-II) loss function. ICL achieves up to 16.7x training speedup and 16.8x faster convergence with competitive results.
arXiv Detail & Related papers (2023-01-28T06:11:31Z)
False Correlation Reduction for Offline Reinforcement Learning [115.11954432080749]
We propose falSe COrrelation REduction (SCORE) for offline RL, a practically effective and theoretically provable algorithm. We empirically show that SCORE achieves SoTA performance with 3.1x acceleration on various tasks in a standard benchmark (D4RL).
arXiv Detail & Related papers (2021-10-24T15:34:03Z)

This list is automatically generated from the titles and abstracts of the papers in this site.