Unveiling the Basin-Like Loss Landscape in Large Language Models
- URL: http://arxiv.org/abs/2505.17646v2
- Date: Wed, 08 Oct 2025 04:36:39 GMT
- Title: Unveiling the Basin-Like Loss Landscape in Large Language Models
- Authors: Huanran Chen, Yinpeng Dong, Zeming Wei, Yao Huang, Yichi Zhang, Hang Su, Jun Zhu,
- Abstract summary: We observe that pre-training creates a \textit{basic capability} basin, and subsequent alignment fine-tuning forms \textit{specific capability} basins. We find that adversarial fine-tuning moves along nearly worst-case directions, thus rapidly degrading model capabilities.
- Score: 64.07900377968143
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We discover the emergence of \textit{basins} in the loss landscape of large language models. As model scale increases, LLMs become progressively more resilient to random perturbations in the parameter space, giving rise to expansive stability regions where models exhibit nearly identical performance, but outside of which their capabilities collapse. We observe that pre-training creates a \textit{basic capability} basin, and subsequent alignment fine-tuning forms \textit{specific capability} basins (e.g., safety, math, coding). Thus, we argue that benign fine-tuning confined to the basin should preserve prior capabilities. We also analyze the loss landscape along worst-case directions, which is consistently sharp and detrimental. We find that adversarial fine-tuning moves along nearly worst-case directions, thus rapidly degrading model capabilities. Finally, we provide a theoretical analysis demonstrating that the basin size bounds the performance degradation of any fine-tuning, including adversarial fine-tuning, while also guaranteeing model robustness with respect to input perturbations, suggesting the benefit of enlarging basins.
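The abstract describes probing the loss landscape by applying random perturbations to model parameters and observing where performance stays flat versus where it collapses. A minimal sketch of that idea, using a toy quadratic loss in place of an LLM's validation loss (the function `basin_profile` and all names here are hypothetical illustrations, not the paper's actual method):

```python
import numpy as np

def loss(params):
    # Toy quadratic loss standing in for an LLM's validation loss.
    return float(np.sum(params ** 2))

def basin_profile(params, radii, n_dirs=8, seed=0):
    """Average loss after perturbing `params` along random unit
    directions at each radius -- a crude probe of basin flatness."""
    rng = np.random.default_rng(seed)
    profile = []
    for r in radii:
        losses = []
        for _ in range(n_dirs):
            d = rng.standard_normal(params.shape)
            d /= np.linalg.norm(d)  # normalize to a unit direction
            losses.append(loss(params + r * d))
        profile.append(float(np.mean(losses)))
    return profile

theta = np.zeros(1000)  # pretend this is the pre-trained minimum
print(basin_profile(theta, radii=[0.0, 0.1, 1.0]))
```

A flat profile at small radii followed by a sharp rise would correspond to the basin boundary described in the abstract; for a real model one would substitute the model's parameters and evaluation loss.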
Related papers
- Mitigating Reward Hacking in RLHF via Bayesian Non-negative Reward Modeling [49.41422138354821]
We propose a principled reward modeling framework that integrates non-negative factor analysis into the Bradley-Terry preference model. BNRM represents rewards through a sparse, non-negative latent factor generative process. We show that BNRM substantially mitigates reward over-optimization, improves robustness under distribution shifts, and yields more interpretable reward decompositions than strong baselines.
arXiv Detail & Related papers (2026-02-11T08:14:11Z) - Understanding Degradation with Vision Language Model [56.09241449206817]
Understanding visual degradations is a critical yet challenging problem in computer vision. We introduce DU-VLM, a multimodal chain-of-thought model trained with supervised fine-tuning and reinforcement learning. We also introduce \textbf{DU-110k}, a large-scale dataset comprising 110,000 clean-degraded pairs with grounded physical annotations.
arXiv Detail & Related papers (2026-02-04T13:51:15Z) - Reducing Memorisation in Generative Models via Riemannian Bayesian Inference [40.41090345118905]
We build a predictive posterior that better captures the variability of the data distribution. We demonstrate that the proposed approach reduces memorisation while preserving generalisation. Overall, our work illustrates how considering the geometry of the loss enables effective use of the parameter space.
arXiv Detail & Related papers (2026-01-30T11:08:51Z) - Da Yu: Towards USV-Based Image Captioning for Waterway Surveillance and Scene Understanding [25.87853252053879]
We introduce WaterCaption, the first captioning dataset specifically designed for waterway environments. WaterCaption focuses on fine-grained, multi-region long-text descriptions. We propose Da Yu, an edge-deployable multi-modal large language model for USVs.
arXiv Detail & Related papers (2025-06-24T03:48:48Z) - Understanding Overadaptation in Supervised Fine-Tuning: The Role of Ensemble Methods [11.695512384798299]
Supervised fine-tuning is the dominant approach for adapting foundation models to specialized tasks. In vision models, ensembling a pretrained model with its fine-tuned counterpart has been shown to mitigate this issue. We observe an overadaptation phenomenon: the ensemble model not only retains general knowledge from the foundation model but also outperforms the fine-tuned model even on the fine-tuning domain itself.
arXiv Detail & Related papers (2025-06-02T17:23:16Z) - Do Language Models Use Their Depth Efficiently? [53.56816097840505]
We analyze the residual stream of the Llama 3.1 and Qwen 3 family of models. We find that layers in the second half contribute much less than those in the first half. For multihop tasks, we are unable to find evidence that models are using increased depth to compose subresults.
arXiv Detail & Related papers (2025-05-20T04:00:56Z) - Energy-Conserving Neural Network Closure Model for Long-Time Accurate and Stable LES [0.0]
We develop a skew-symmetric neural architecture as a closure model that enforces stability while preserving key physical conservation laws. Our approach leverages a discretization that ensures mass, momentum, and energy conservation, along with a face-averaging filter to maintain mass conservation in coarse-grained velocity fields.
arXiv Detail & Related papers (2025-04-08T09:49:18Z) - Understanding Flatness in Generative Models: Its Role and Benefits [9.775257597631244]
Flat minima, known to enhance robustness in supervised learning, remain largely unexplored in generative models. We establish a theoretical claim that flatter minima improve robustness against perturbations in target prior distributions. We demonstrate that flat minima in diffusion models indeed improve not only generative performance but also robustness.
arXiv Detail & Related papers (2025-03-14T04:38:53Z) - HYPNOS : Highly Precise Foreground-focused Diffusion Finetuning for Inanimate Objects [1.706656684496508]
A robust diffusion model is determined by its ability to perform near-perfect reconstruction of certain product outcomes.
The current prominent diffusion-based finetuning technique falls short in maintaining the foreground object consistency.
We propose Hypnos, a highly precise foreground-focused diffusion finetuning technique.
arXiv Detail & Related papers (2024-10-18T08:20:37Z) - Depth Anything V2 [84.88796880335283]
V2 produces much finer and more robust depth predictions through three key practices.
We replace all labeled real images with synthetic images, scale up the capacity of our teacher model, and teach student models via the bridge of large-scale pseudo-labeled real images.
Benefiting from their strong generalization capability, we fine-tune them with metric depth labels to obtain our metric depth models.
arXiv Detail & Related papers (2024-06-13T17:59:56Z) - Super Consistency of Neural Network Landscapes and Learning Rate Transfer [72.54450821671624]
We study the landscape through the lens of the loss Hessian.
We find that certain spectral properties under $\mu$P are largely independent of the size of the network.
We show that in the Neural Tangent Kernel (NTK) and other scaling regimes, the sharpness exhibits very different dynamics at different scales.
arXiv Detail & Related papers (2024-02-27T12:28:01Z) - Learning Robust Precipitation Forecaster by Temporal Frame Interpolation [65.5045412005064]
We develop a robust precipitation forecasting model that demonstrates resilience against spatial-temporal discrepancies.
Our approach has led to significant improvements in forecasting precision, culminating in our model securing \textit{1st} place in the transfer learning leaderboard of the \textit{Weather4cast'23} competition.
arXiv Detail & Related papers (2023-11-30T08:22:08Z) - On the Embedding Collapse when Scaling up Recommendation Models [53.66285358088788]
We identify the embedding collapse phenomenon as the inhibition of scalability, wherein the embedding matrix tends to occupy a low-dimensional subspace.
We propose a simple yet effective multi-embedding design incorporating embedding-set-specific interaction modules to learn embedding sets with large diversity.
arXiv Detail & Related papers (2023-10-06T17:50:38Z) - Uncertainty Quantification in Inverse Models in Hydrology [9.020366051310384]
We propose a knowledge-guided, probabilistic inverse modeling method for recovering physical characteristics from streamflow and weather data.
We compare our framework with state-of-the-art inverse models for estimating river basin characteristics.
Our framework also offers improved explainability since it can quantify uncertainty in both the inverse and the forward model.
arXiv Detail & Related papers (2023-10-03T16:39:21Z) - Fine-tuning can cripple your foundation model; preserving features may be the solution [87.35911633187204]
A fine-tuned model's ability to recognize concepts on tasks is reduced significantly compared to its pre-trained counterpart.
We propose a new fine-tuning method called \textit{LDIFS} that, while learning new concepts related to the downstream task, allows a model to preserve its pre-trained knowledge as well.
arXiv Detail & Related papers (2023-08-25T11:49:51Z) - Can we avoid Double Descent in Deep Neural Networks? [3.1473798197405944]
Double descent has caught the attention of the deep learning community.
It raises serious questions about the optimal model's size to maintain high generalization.
Our work shows that the double descent phenomenon is potentially avoidable with proper conditioning of the learning problem.
arXiv Detail & Related papers (2023-02-26T08:12:28Z) - An evaluation of deep learning models for predicting water depth evolution in urban floods [59.31940764426359]
We compare different deep learning models for prediction of water depth at high spatial resolution.
Deep learning models are trained to reproduce the data simulated by the CADDIES cellular-automata flood model.
Our results show that the deep learning models present in general lower errors compared to the other methods.
arXiv Detail & Related papers (2023-02-20T16:08:54Z) - Towards Accurate Reconstruction of 3D Scene Shape from A Single Monocular Image [91.71077190961688]
We propose a two-stage framework that first predicts depth up to an unknown scale and shift from a single monocular image.
We then exploit 3D point cloud data to predict the depth shift and the camera's focal length, which allow us to recover 3D scene shapes.
We test our depth model on nine unseen datasets and achieve state-of-the-art performance on zero-shot evaluation.
arXiv Detail & Related papers (2022-08-28T16:20:14Z) - Likelihood Landscapes: A Unifying Principle Behind Many Adversarial Defenses [15.629921195632857]
We investigate the potential effect defense techniques have on the geometry of the likelihood landscape.
A subset of adversarial defense techniques results in a similar effect of flattening the likelihood landscape.
arXiv Detail & Related papers (2020-08-25T22:51:51Z) - The Global Landscape of Neural Networks: An Overview [23.79848233534269]
Recent success of neural networks suggests that their loss is not too bad, but what do we know about the landscape?
We discuss a few rigorous results on the geometric properties of wide networks, such as the absence of bad basins, as well as some modifications that eliminate suboptimal local minima and decreasing paths to infinity.
arXiv Detail & Related papers (2020-07-02T22:50:20Z) - Learnable Bernoulli Dropout for Bayesian Deep Learning [53.79615543862426]
Learnable Bernoulli dropout (LBD) is a new model-agnostic dropout scheme that considers the dropout rates as parameters jointly optimized with other model parameters.
LBD leads to improved accuracy and uncertainty estimates in image classification and semantic segmentation.
arXiv Detail & Related papers (2020-02-12T18:57:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.