Entropy Production in Machine Learning Under Fokker-Planck Probability Flow
- URL: http://arxiv.org/abs/2601.00554v1
- Date: Fri, 02 Jan 2026 04:01:57 GMT
- Title: Entropy Production in Machine Learning Under Fokker-Planck Probability Flow
- Authors: Lennon Shikhman
- Abstract summary: We propose an entropy-based retraining framework grounded in nonequilibrium stochastic dynamics. We show that entropy-triggered retraining achieves predictive performance comparable to high-frequency retraining.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning models deployed in nonstationary environments experience performance degradation due to data drift. While many drift detection heuristics exist, most lack a principled dynamical interpretation and provide limited guidance on how retraining frequency should be balanced against operational cost. In this work, we propose an entropy-based retraining framework grounded in nonequilibrium stochastic dynamics. Modeling deployment-time data drift as probability flow governed by a Fokker-Planck equation, we quantify model-data mismatch using a time-evolving Kullback-Leibler divergence. We show that the time derivative of this mismatch admits an entropy-balance decomposition featuring a nonnegative entropy production term driven by probability currents. This interpretation motivates entropy-triggered retraining as a label-free intervention strategy that responds to accumulated mismatch rather than delayed performance collapse. In a controlled nonstationary classification experiment, entropy-triggered retraining achieves predictive performance comparable to high-frequency retraining while reducing retraining events by an order of magnitude relative to daily and label-based policies.
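The entropy-triggered policy described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a one-dimensional Gaussian data density with constant drift and diffusion (so the Fokker-Planck solution stays Gaussian and the Kullback-Leibler divergence has a closed form), and the function names, parameters, and threshold value are all hypothetical.

```python
import math


def gaussian_kl(mu_p, var_p, mu_q, var_q):
    """Closed-form KL(p || q) between two 1D Gaussians."""
    return 0.5 * (
        math.log(var_q / var_p)
        + (var_p + (mu_p - mu_q) ** 2) / var_q
        - 1.0
    )


def simulate_entropy_triggered_retraining(
    steps=200, dt=0.1, drift=0.5, diffusion=0.05, threshold=0.5
):
    """Return the time steps at which retraining is triggered.

    Assumption: data follow dX = drift*dt + sqrt(2*diffusion)*dW, so the
    Fokker-Planck density p_t is Gaussian with d(mean)/dt = drift and
    d(variance)/dt = 2*diffusion.
    """
    mu, var = 0.0, 1.0              # current data density p_t
    model_mu, model_var = mu, var   # density the model was last fit to
    retrain_steps = []
    for step in range(1, steps + 1):
        # Advance the data density by one time step of the flow.
        mu += drift * dt
        var += 2.0 * diffusion * dt
        # Label-free mismatch between current data and the stale model.
        mismatch = gaussian_kl(mu, var, model_mu, model_var)
        if mismatch > threshold:
            # "Retrain": re-anchor the model to the current density.
            model_mu, model_var = mu, var
            retrain_steps.append(step)
    return retrain_steps


if __name__ == "__main__":
    events = simulate_entropy_triggered_retraining()
    print(f"{len(events)} retraining events in 200 steps: {events}")
```

In this toy setting the trigger fires only after mismatch has accumulated past the threshold, so retraining happens far less often than once per step; the threshold controls the trade-off between retraining cost and tolerated mismatch.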
Related papers
- Entropy-Reservoir Bregman Projection: An Information-Geometric Unification of Model Collapse [3.533187668612022]
We present Entropy-Reservoir Bregman Projection (ERBP), an information-geometric framework that unifies these phenomena.
Our theory yields (i) a necessary condition for collapse, (ii) a sufficient condition that guarantees a non-trivial entropy floor, and (iii) closed-form rates that depend on sample size.
arXiv Detail & Related papers (2025-12-16T19:50:03Z) - Drift No More? Context Equilibria in Multi-Turn LLM Interactions [58.69551510148673]
Context drift is the gradual divergence of a model's outputs from goal-consistent behavior across turns.
Unlike single-turn errors, drift unfolds temporally and is poorly captured by static evaluation metrics.
We show that multi-turn drift can be understood as a controllable equilibrium phenomenon rather than as inevitable decay.
arXiv Detail & Related papers (2025-10-09T04:48:49Z) - ResCP: Reservoir Conformal Prediction for Time Series Forecasting [39.81023599249223]
Conformal prediction offers a powerful framework for building distribution-free prediction intervals for exchangeable data.
We propose Reservoir Conformal Prediction (ResCP), a novel training-free conformal prediction method for time series.
arXiv Detail & Related papers (2025-10-06T17:37:44Z) - Neural MJD: Neural Non-Stationary Merton Jump Diffusion for Time Series Prediction [13.819057582932214]
We introduce Neural MJD, a neural-network-based non-stationary Merton jump diffusion (MJD) model.
Our model explicitly formulates forecasting as a stochastic differential equation (SDE) simulation problem.
To enable tractable learning, we introduce a likelihood truncation mechanism that caps the number of jumps within small time intervals.
arXiv Detail & Related papers (2025-06-05T01:23:28Z) - Thermalizer: Stable autoregressive neural emulation of spatiotemporal chaos [32.51861730498945]
We show that an implicit estimator of the score of an invariant measure can be used to stabilize autoregressive emulator rollouts.
We show that this model of the score function can be used to stabilize autoregressive rollouts by applying on-the-fly denoising during inference.
arXiv Detail & Related papers (2025-03-24T14:38:33Z) - Error-quantified Conformal Inference for Time Series [55.11926160774831]
Uncertainty quantification in time series prediction is challenging due to the temporal dependence and distribution shift on sequential data.
We propose Error-quantified Conformal Inference (ECI) by smoothing the quantile loss function.
ECI can achieve valid miscoverage control and output tighter prediction sets than other baselines.
arXiv Detail & Related papers (2025-02-02T15:02:36Z) - Probabilistic Forecasting with Stochastic Interpolants and Föllmer Processes [18.344934424278048]
We propose a framework for probabilistic forecasting of dynamical systems based on generative modeling.
We show that the drift and the diffusion coefficients of this SDE can be adjusted after training, and that a specific choice that minimizes the impact of the estimation error gives a Föllmer process.
arXiv Detail & Related papers (2024-03-20T16:33:06Z) - Time-series Generation by Contrastive Imitation [87.51882102248395]
We study a generative framework that seeks to combine the strengths of both: Motivated by a moment-matching objective to mitigate compounding error, we optimize a local (but forward-looking) transition policy.
At inference, the learned policy serves as the generator for iterative sampling, and the learned energy serves as a trajectory-level measure for evaluating sample quality.
arXiv Detail & Related papers (2023-11-02T16:45:25Z) - Stabilizing Machine Learning Prediction of Dynamics: Noise and Noise-inspired Regularization [58.720142291102135]
Recent work has shown that machine learning (ML) models can be trained to accurately forecast the dynamics of chaotic dynamical systems.
In the absence of mitigating techniques, this technique can result in artificially rapid error growth, leading to inaccurate predictions and/or climate instability.
We introduce Linearized Multi-Noise Training (LMNT), a regularization technique that deterministically approximates the effect of many small, independent noise realizations added to the model input during training.
arXiv Detail & Related papers (2022-11-09T23:40:52Z) - Training Generative Adversarial Networks by Solving Ordinary Differential Equations [54.23691425062034]
We study the continuous-time dynamics induced by GAN training.
From this perspective, we hypothesise that instabilities in training GANs arise from the integration error.
We experimentally verify that well-known ODE solvers (such as Runge-Kutta) can stabilise training.
arXiv Detail & Related papers (2020-10-28T15:23:49Z) - Stochastically forced ensemble dynamic mode decomposition for forecasting and analysis of near-periodic systems [65.44033635330604]
We introduce a novel load forecasting method in which observed dynamics are modeled as a forced linear system.
We show that its use of intrinsic linear dynamics offers a number of desirable properties in terms of interpretability and parsimony.
Results are presented for a test case using load data from an electrical grid.
arXiv Detail & Related papers (2020-10-08T20:25:52Z)