Related papers: Hidden Breakthroughs in Language Model Training

URL: http://arxiv.org/abs/2506.15872v2
Date: Mon, 23 Jun 2025 11:55:45 GMT
Title: Hidden Breakthroughs in Language Model Training
Authors: Sara Kangaslahti, Elan Rosenfeld, Naomi Saphra
Abstract summary: This paper argues that similar breakthroughs occur frequently throughout training but are obscured by a loss metric that collapses all variation into a single scalar. We introduce POLCA, a method for decomposing changes in loss along arbitrary bases of the low-rank training subspace. We validate our method on synthetic arithmetic and natural language tasks, showing that POLCA recovers clusters that represent interpretable breakthroughs in the model's capabilities.
Score: 9.183934538035562
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Loss curves are smooth during most of model training, so visible discontinuities stand out as possible conceptual breakthroughs. Studying these breakthroughs enables a deeper understanding of learning dynamics, but only when they are properly identified. This paper argues that similar breakthroughs occur frequently throughout training but they are obscured by a loss metric that collapses all variation into a single scalar. To find these hidden transitions, we introduce POLCA, a method for decomposing changes in loss along arbitrary bases of the low-rank training subspace. We use our method to identify clusters of samples that share similar changes in loss during training, disaggregating the overall loss into that of smaller groups of conceptually similar data. We validate our method on synthetic arithmetic and natural language tasks, showing that POLCA recovers clusters that represent interpretable breakthroughs in the model's capabilities. We demonstrate the promise of these hidden phase transitions as a tool for unsupervised interpretability.
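As a rough illustration of what "decomposing changes in loss along a basis" can mean, the sketch below splits a first-order estimate of one sample's loss change into per-direction contributions. It is not the authors' POLCA implementation: the function name, the random orthonormal basis, and the first-order approximation are all assumptions made for this example.

```python
import numpy as np


def loss_change_by_direction(grad, delta_theta, basis):
    """First-order loss change attributed to each basis direction.

    grad:        (d,) gradient of one sample's loss at the current parameters
    delta_theta: (d,) parameter update between two checkpoints
    basis:       (d, k) matrix with orthonormal columns (hypothetical basis)

    The k entries sum to grad @ delta_theta whenever delta_theta lies in
    the span of the basis columns.
    """
    return (basis.T @ grad) * (basis.T @ delta_theta)


rng = np.random.default_rng(0)
d, k = 6, 3

# Hypothetical low-rank subspace: orthonormal columns from a reduced QR.
basis, _ = np.linalg.qr(rng.normal(size=(d, k)))

# An update that lies inside the subspace, and an arbitrary sample gradient.
delta_theta = basis @ rng.normal(size=k)
grad = rng.normal(size=d)

parts = loss_change_by_direction(grad, delta_theta, basis)
total = grad @ delta_theta  # first-order estimate of the full loss change
```

Stacking such per-direction vectors for many samples gives a matrix whose rows could then be clustered, which is the kind of grouping the abstract describes.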

Related papers

Grokking as a Phase Transition between Competing Basins: a Singular Learning Theory Approach [3.551701030393209]
We study grokking, the abrupt transition from memorization to generalisation after extended training. We interpret grokking in quadratic networks as a phase transition between competing near-zero-loss solution basins. Our contributions are two-fold: we derive closed-form expressions for the LLC in quadratic networks trained on modular arithmetic tasks, with the corresponding empirical verification.
arXiv Detail & Related papers (2026-03-01T17:23:29Z)
Weight Factorization and Centralization for Continual Learning in Speech Recognition [55.63455095283984]
Continually training models in a rehearsal-free, multilingual, and language-agnostic condition likely leads to catastrophic forgetting. Inspired by the ability of human brains to learn and consolidate knowledge through the waking-sleeping cycle, we propose a continual learning approach.
arXiv Detail & Related papers (2025-06-19T19:59:24Z)
Disentangled World Models: Learning to Transfer Semantic Knowledge from Distracting Videos for Reinforcement Learning [65.85335291827086]
This paper tries to learn and understand underlying semantic variations from distracting videos via offline-to-online latent distillation and flexible disentanglement constraints. We pretrain the action-free video prediction model offline with disentanglement regularization to extract semantic knowledge from distracting videos. For finetuning in the online environment, we exploit the knowledge from the pretrained model and introduce a disentanglement constraint to the world model.
arXiv Detail & Related papers (2025-03-11T13:50:22Z)
Diffusing States and Matching Scores: A New Framework for Imitation Learning [16.941612670582522]
Adversarial Imitation Learning is traditionally framed as a two-player zero-sum game between a learner and an adversarially chosen cost function. Diffusion models have emerged as a non-adversarial alternative to GANs that merely require training a score function via regression. We show our approach outperforms both GAN-style imitation learning baselines and discriminator-free imitation learning baselines across various continuous control problems.
arXiv Detail & Related papers (2024-10-17T17:59:25Z)
Temporal-Difference Variational Continual Learning [89.32940051152782]
We propose new learning objectives that integrate the regularization effects of multiple previous posterior estimations. Our approach effectively mitigates Catastrophic Forgetting, outperforming strong Variational CL methods.
arXiv Detail & Related papers (2024-10-10T10:58:41Z)
Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple Logits Retargeting Approach [102.0769560460338]
We develop a simple logits retargeting approach (LORT) without the requirement of prior knowledge of the number of samples per class. Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z)
Time-series Generation by Contrastive Imitation [87.51882102248395]
We study a generative framework that seeks to combine the strengths of both: Motivated by a moment-matching objective to mitigate compounding error, we optimize a local (but forward-looking) transition policy. At inference, the learned policy serves as the generator for iterative sampling, and the learned energy serves as a trajectory-level measure for evaluating sample quality.
arXiv Detail & Related papers (2023-11-02T16:45:25Z)
A Unified Generalization Analysis of Re-Weighting and Logit-Adjustment for Imbalanced Learning [129.63326990812234]
We propose a technique named data-dependent contraction to capture how modified losses handle different classes. On top of this technique, a fine-grained generalization bound is established for imbalanced learning, which helps reveal the mystery of re-weighting and logit-adjustment.
arXiv Detail & Related papers (2023-10-07T09:15:08Z)
Prototypical quadruplet for few-shot class incremental learning [24.814045065163135]
We propose a novel method that improves classification robustness by identifying a better embedding space using an improved contrasting loss. Our approach retains previously acquired knowledge in the embedding space, even when trained with new classes. We demonstrate the effectiveness of our method by showing that the embedding space remains intact after training the model with new classes and outperforms existing state-of-the-art algorithms in terms of accuracy across different sessions.
arXiv Detail & Related papers (2022-11-05T17:19:14Z)
A Simple but Tough-to-Beat Data Augmentation Approach for Natural Language Understanding and Generation [53.8171136907856]
We introduce a set of simple yet effective data augmentation strategies dubbed cutoff. Cutoff relies on sampling consistency and thus adds little computational overhead. Cutoff consistently outperforms adversarial training and achieves state-of-the-art results on the IWSLT2014 German-English dataset.
arXiv Detail & Related papers (2020-09-29T07:08:35Z)
A Contraction Approach to Model-based Reinforcement Learning [11.701145942745274]
We analyze the error in the cumulative reward using a contraction approach. We prove that branched rollouts can reduce this error. In this case, we show that GAN-type learning has an advantage over Behavioral Cloning when its discriminator is well-trained.
arXiv Detail & Related papers (2020-09-18T02:03:14Z)
Semi-Discriminative Representation Loss for Online Continual Learning [16.414031859647874]
Gradient-based approaches have been developed to make more efficient use of compact episodic memory. We propose a simple method, Semi-Discriminative Representation Loss (SDRL), for continual learning.
arXiv Detail & Related papers (2020-06-19T17:13:42Z)

This list is automatically generated from the titles and abstracts of the papers on this site.