Related papers: Large Language Models as Discounted Bayesian Filters

Large Language Models as Discounted Bayesian Filters

URL: http://arxiv.org/abs/2512.18489v1
Date: Sat, 20 Dec 2025 19:56:39 GMT
Title: Large Language Models as Discounted Bayesian Filters
Authors: Jensen Zhang, Jing Yang, Keze Wang,
Abstract summary: We introduce a Bayesian filtering framework to evaluate online inference in Large Language Models (LLMs)<n>We find that while LLM belief updates resemble Bayesian posteriors, they are more accurately characterized by an exponential forgetting filter with a model-specific discount factor smaller than one.<n>Although inherent priors are often miscalibrated, the updating mechanism itself remains structured and principled.
Score: 14.164508061248775
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models (LLMs) demonstrate strong few-shot generalization through in-context learning, yet their reasoning in dynamic and stochastic environments remains opaque. Prior studies mainly focus on static tasks and overlook the online adaptation required when beliefs must be continuously updated, which is a key capability for LLMs acting as world models or agents. We introduce a Bayesian filtering framework to evaluate online inference in LLMs. Our probabilistic probe suite spans both multivariate discrete distributions, such as dice rolls, and continuous distributions, such as Gaussian processes, where ground-truth parameters shift over time. We find that while LLM belief updates resemble Bayesian posteriors, they are more accurately characterized by an exponential forgetting filter with a model-specific discount factor smaller than one. This reveals systematic discounting of older evidence that varies significantly across model architectures. Although inherent priors are often miscalibrated, the updating mechanism itself remains structured and principled. We further validate these findings in a simulated agent task and propose prompting strategies that effectively recalibrate priors with minimal computational cost.

Related papers

A Multi-Criteria Automated MLOps Pipeline for Cost-Effective Cloud-Based Classifier Retraining in Response to Data Distribution Shifts [0.0]
The performance of machine learning (ML) models often deteriorates when the underlying data distribution changes over time.<n> ML Operations (MLOps) is often manual, i.e., humans trigger the process of model retraining and redeployment.<n>We present an automated MLOps pipeline designed to address neural network retraining in response to significant data distribution changes.
arXiv Detail & Related papers (2025-12-12T13:22:14Z)
Simulation-based inference via telescoping ratio estimation for trawl processes [0.0]
We propose a fast, accurate, sample-efficient Markov-based inference framework for intractable processes.<n>We use Chebyshev approximations to efficiently generate independent posterior samples, enabling accurate inference even when chain Monte Carlo methods mix poorly.<n>We demonstrate the method's effectiveness on trawl processes, a class of flexible infinitely divisible models that applied to energy demand data.
arXiv Detail & Related papers (2025-10-05T05:26:46Z)
Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs [78.09559830840595]
We present the first systematic study on quantizing diffusion-based language models.<n>We identify the presence of activation outliers, characterized by abnormally large activation values.<n>We implement state-of-the-art PTQ methods and conduct a comprehensive evaluation.
arXiv Detail & Related papers (2025-08-20T17:59:51Z)
Inference-Time Scaling of Diffusion Language Models with Particle Gibbs Sampling [70.8832906871441]
We study how to steer generation toward desired rewards without retraining the models.<n>Prior methods typically resample or filter within a single denoising trajectory, optimizing rewards step-by-step without trajectory-level refinement.<n>We introduce particle Gibbs sampling for diffusion language models (PG-DLM), a novel inference-time algorithm enabling trajectory-level refinement while preserving generation perplexity.
arXiv Detail & Related papers (2025-07-11T08:00:47Z)
Boosting LLM-based Relevance Modeling with Distribution-Aware Robust Learning [14.224921308101624]
We propose a novel Distribution-Aware Robust Learning framework (DaRL) for relevance modeling.<n>DaRL has been deployed online to serve the Alipay's insurance product search.
arXiv Detail & Related papers (2024-12-17T03:10:47Z)
Mamba-PTQ: Outlier Channels in Recurrent Large Language Models [49.1574468325115]
We show that Mamba models exhibit the same pattern of outlier channels observed in attention-based LLMs. We show that the reason for the difficulty of quantizing SSMs is caused by activation outliers, similar to those observed in transformer-based LLMs.
arXiv Detail & Related papers (2024-07-17T08:21:06Z)
Cycles of Thought: Measuring LLM Confidence through Stable Explanations [53.15438489398938]
Large language models (LLMs) can reach and even surpass human-level accuracy on a variety of benchmarks, but their overconfidence in incorrect responses is still a well-documented failure mode. We propose a framework for measuring an LLM's uncertainty with respect to the distribution of generated explanations for an answer.
arXiv Detail & Related papers (2024-06-05T16:35:30Z)
Amortizing intractable inference in large language models [56.92471123778389]
We use amortized Bayesian inference to sample from intractable posterior distributions. We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training. As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem.
arXiv Detail & Related papers (2023-10-06T16:36:08Z)
Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios. We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z)

This list is automatically generated from the titles and abstracts of the papers in this site.