Spilled Energy in Large Language Models
- URL: http://arxiv.org/abs/2602.18671v3
- Date: Fri, 27 Feb 2026 20:47:44 GMT
- Title: Spilled Energy in Large Language Models
- Authors: Adrian Robert Minut, Hazem Dewidar, Iacopo Masi
- Abstract summary: We reinterpret the final Large Language Model (LLM) softmax classifier as an Energy-Based Model (EBM). This principled approach allows us to track "energy spills" during decoding, which we empirically show correlate with factual errors, biases, and failures.
- Score: 3.434649016649368
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We reinterpret the final Large Language Model (LLM) softmax classifier as an Energy-Based Model (EBM), decomposing the sequence-to-sequence probability chain into multiple interacting EBMs at inference. This principled approach allows us to track "energy spills" during decoding, which we empirically show correlate with factual errors, biases, and failures. Similar to Orgad et al. (2025), our method localizes the exact answer token and subsequently tests for hallucinations. Crucially, however, we achieve this without requiring trained probe classifiers or activation ablations. Instead, we introduce two completely training-free metrics derived directly from output logits: spilled energy, which captures the discrepancy between energy values across consecutive generation steps that should theoretically match, and marginalized energy, which is measurable at a single step. Evaluated on nine benchmarks across state-of-the-art LLMs (including LLaMA, Mistral, and Gemma) and on synthetic algebraic operations (Qwen3), our approach demonstrates robust, competitive hallucination detection and cross-task generalization. Notably, these results hold for both pretrained and instruction-tuned variants without introducing any training overhead. Code available at: github.com/OmnAI-Lab/spilled-energy
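The abstract's energy view of the softmax can be sketched in a few lines. The following is a minimal illustration under the standard EBM reading of a softmax classifier, not the paper's implementation: the marginalized (free) energy of a decoding step is the negative log-sum-exp of its logits, and a "spill" is taken here as the gap between energy values at consecutive steps that should theoretically match (the paper's exact metric definitions may differ).

```python
import math

def marginalized_energy(logits):
    """Free energy of a softmax step viewed as an EBM:
    E(x) = -log sum_y exp(logit_y).  Measurable at a single
    generation step from the output logits alone."""
    m = max(logits)  # stabilize log-sum-exp against overflow
    return -(m + math.log(sum(math.exp(l - m) for l in logits)))

def spilled_energy(energy_t, energy_t_plus_1):
    """Illustrative spill: the discrepancy between energy values
    across consecutive generation steps that the sequence
    decomposition says should agree."""
    return abs(energy_t - energy_t_plus_1)

# Toy usage on hand-written logits for two consecutive steps.
e0 = marginalized_energy([2.0, 0.5, -1.0])
e1 = marginalized_energy([1.8, 0.4, -1.2])
spill = spilled_energy(e0, e1)
```

Both quantities are training-free in the sense described above: they read off the logits the model already produces, with no probe classifier in the loop.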
Related papers
- A Diffusive Classification Loss for Learning Energy-based Generative Models [27.078167178968076]
We introduce the Diffusive Classification (DiffCLF) objective, a simple method that avoids blindness while remaining computationally efficient. We validate the effectiveness of DiffCLF by comparing the estimated energies against ground truth in analytical Gaussian mixture cases. Our results show that DiffCLF enables EBMs with higher fidelity and broader applicability than existing approaches.
arXiv Detail & Related papers (2026-01-28T20:37:53Z) - Energy-Guided Flow Matching Enables Few-Step Conformer Generation and Ground-State Identification [45.52894539097255]
We present EnFlow, a unified framework that couples flow matching with an explicitly learned energy model. By incorporating energy-gradient guidance during sampling, our method steers trajectories toward lower-energy regions. The learned energy function further enables efficient energy-based ranking of generated ensembles for accurate ground-state identification.
arXiv Detail & Related papers (2025-12-27T14:00:22Z) - Energy-Based Transformers are Scalable Learners and Thinkers [84.7474634026213]
Energy-Based Transformers (EBTs) are a new class of Energy-Based Models (EBMs). We train EBTs to assign an energy value to every input and candidate-prediction pair, enabling predictions through gradient-descent-based energy minimization until convergence. During inference, EBTs improve performance with System 2 Thinking by 29% more than the Transformer++ on language tasks.
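The gradient-descent inference described above can be illustrated with a toy scalar energy. This is only a sketch of the minimization loop: the quadratic energy and its hand-written gradient are hypothetical stand-ins for a learned EBT energy function.

```python
# Sketch of energy-based prediction: refine a candidate y by
# descending an energy surface E(x, y) in y until (approximate)
# convergence.  Here E(y) = (y - 3)^2 stands in for a learned model,
# with analytic gradient dE/dy = 2 * (y - 3).
def refine(y0, energy_grad, lr=0.1, steps=200):
    y = y0
    for _ in range(steps):
        y -= lr * energy_grad(y)  # one gradient step of energy minimization
    return y

y_hat = refine(0.0, lambda y: 2.0 * (y - 3.0))  # converges toward y = 3
```

Spending more refinement steps at inference is the "System 2 Thinking" knob: extra compute buys a lower-energy (better) prediction.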
arXiv Detail & Related papers (2025-07-02T19:17:29Z) - Learning to Rank Chain-of-Thought: Using a Small Model [77.75522308463667]
This paper introduces the Energy Outcome Reward Model (EORM), a highly efficient, lightweight verifier designed to address this challenge. EORM uses an energy-based framework to rank Chain-of-Thought (CoT) solutions, learning to distinguish correct from incorrect reasoning using only simple outcome labels. With only 55M parameters, over 127 times smaller than typical reward models, EORM boosts the accuracy of Llama 3 8B to 90.7% on GSM8k and 63.7% on MATH.
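Ranking candidate solutions by a scalar energy reduces to picking the minimum. A minimal sketch, assuming a hypothetical `energy_fn` in place of the trained 55M-parameter verifier:

```python
def pick_lowest_energy(candidates, energy_fn):
    """Select the chain-of-thought whose scalar energy is lowest;
    lower energy = judged more likely correct.  `energy_fn` is a
    hypothetical stand-in for a learned outcome verifier."""
    return min(candidates, key=energy_fn)

# Toy candidates and a toy energy that prefers the correct arithmetic.
cots = ["... so 2 + 2 = 5", "... so 2 + 2 = 4"]
best = pick_lowest_energy(cots, lambda c: 0.0 if c.endswith("= 4") else 1.0)
```

In practice the candidates would be sampled CoT completions from the base LLM, and the verifier replaces the toy lambda.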
arXiv Detail & Related papers (2025-05-21T01:06:29Z) - Adjoint Sampling: Highly Scalable Diffusion Samplers via Adjoint Matching [33.9461078261722]
We introduce Adjoint Sampling, a highly scalable and efficient algorithm for learning diffusion processes that sample from unnormalized densities. We show how to incorporate key symmetries, as well as periodic boundary conditions, for modeling molecules in both Cartesian and torsional coordinates. We demonstrate the effectiveness of our approach through extensive experiments on classical energy functions, and further scale up to neural network-based energy models.
arXiv Detail & Related papers (2025-04-16T02:20:06Z) - Unconditional Truthfulness: Learning Unconditional Uncertainty of Large Language Models [104.55763564037831]
We train a regression model that leverages attention maps, probabilities at the current generation step, and recurrently computed uncertainty scores from previously generated tokens. Our evaluation shows that the proposed method is highly effective for selective generation, achieving substantial improvements over rival unsupervised and supervised approaches.
arXiv Detail & Related papers (2024-08-20T09:42:26Z) - Time-series Generation by Contrastive Imitation [87.51882102248395]
We study a generative framework that seeks to combine the strengths of both: Motivated by a moment-matching objective to mitigate compounding error, we optimize a local (but forward-looking) transition policy.
At inference, the learned policy serves as the generator for iterative sampling, and the learned energy serves as a trajectory-level measure for evaluating sample quality.
arXiv Detail & Related papers (2023-11-02T16:45:25Z) - BOOST: Harnessing Black-Box Control to Boost Commonsense in LMs' Generation [60.77990074569754]
We present a computation-efficient framework that steers a frozen Pre-Trained Language Model towards more commonsensical generation.
Specifically, we first construct a reference-free evaluator that assigns a sentence with a commonsensical score.
We then use the scorer as the oracle for commonsense knowledge, and extend the controllable generation method called NADO to train an auxiliary head.
arXiv Detail & Related papers (2023-10-25T23:32:12Z) - No Train Still Gain. Unleash Mathematical Reasoning of Large Language Models with Monte Carlo Tree Search Guided by Energy Function [3.0299876288833345]
Large language models (LLMs) demonstrate impressive language understanding and contextual learning abilities.
LLMs often struggle to generate correct reasoning steps and answers despite assigning high probabilities to the correct solutions.
We propose a method that incorporates Monte Carlo Tree Search (MCTS) and a lightweight energy function to rank decision steps.
arXiv Detail & Related papers (2023-09-01T13:10:54Z) - Pre-training Language Model as a Multi-perspective Course Learner [103.17674402415582]
This study proposes a multi-perspective course learning (MCL) method for sample-efficient pre-training.
In this study, three self-supervision courses are designed to alleviate inherent flaws of "tug-of-war" dynamics.
Our method significantly improves ELECTRA's average performance by 2.8% and 3.2% absolute points respectively on GLUE and SQuAD 2.0 benchmarks.
arXiv Detail & Related papers (2023-05-06T09:02:10Z) - Dual Training of Energy-Based Models with Overparametrized Shallow Neural Networks [41.702175127106784]
Energy-based models (EBMs) are generative models that are usually trained via maximum likelihood estimation.
We propose a dual formulation of an EBM algorithm in which the particles are sometimes restarted at random samples drawn from the data set, and show that performing these restarts at every step corresponds to score matching training.
These results are illustrated in simple numerical experiments.
arXiv Detail & Related papers (2021-07-11T21:43:18Z) - Energy-Efficient and Federated Meta-Learning via Projected Stochastic Gradient Ascent [79.58680275615752]
We propose an energy-efficient federated meta-learning framework.
We assume each task is owned by a separate agent, so a limited number of tasks is used to train a meta-model.
arXiv Detail & Related papers (2021-05-31T08:15:44Z) - Joint Energy-based Model Training for Better Calibrated Natural Language Understanding Models [61.768082640087]
We explore joint energy-based model (EBM) training during the finetuning of pretrained text encoders for natural language understanding tasks.
Experiments show that EBM training can help the model reach a better calibration that is competitive to strong baselines.
arXiv Detail & Related papers (2021-01-18T01:41:31Z) - Energy-Based Reranking: Improving Neural Machine Translation Using Energy-Based Models [59.039592890187144]
We study the discrepancy between maximum likelihood estimation (MLE) and task measures such as BLEU score for autoregressive neural machine translation (NMT).
Samples drawn from an MLE-trained NMT model support the desired distribution -- there are samples with much higher BLEU scores than the beam decoding output.
We use both marginal energy models (over target sentence) and joint energy models (over both source and target sentences) to improve our algorithm.
arXiv Detail & Related papers (2020-09-20T02:50:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.