Memory-Conditioned Flow-Matching for Stable Autoregressive PDE Rollouts
- URL: http://arxiv.org/abs/2602.06689v1
- Date: Fri, 06 Feb 2026 13:21:52 GMT
- Title: Memory-Conditioned Flow-Matching for Stable Autoregressive PDE Rollouts
- Authors: Victor Armegioiu,
- Abstract summary: Autoregressive generative PDE solvers can be accurate one step ahead yet drift over long rollouts. We show that eliminating unresolved variables yields an exact resolved evolution with a Markov term. We then derive discrete Grönwall rollout bounds that separate memory approximation from conditional generation error.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Autoregressive generative PDE solvers can be accurate one step ahead yet drift over long rollouts, especially in coarse-to-fine regimes where each step must regenerate unresolved fine scales. This is the regime of diffusion and flow-matching generators: although their internal dynamics are Markovian, rollout stability is governed by per-step conditional law errors. Using the Mori–Zwanzig projection formalism, we show that eliminating unresolved variables yields an exact resolved evolution with a Markov term, a memory term, and an orthogonal forcing, exposing a structural limitation of memoryless closures. Motivated by this, we introduce memory-conditioned diffusion/flow-matching with a compact online state injected into denoising via latent features. Via disintegration, memory induces a structured conditional tail prior for unresolved scales and reduces the transport needed to populate missing frequencies. We prove Wasserstein stability of the resulting conditional kernel. We then derive discrete Grönwall rollout bounds that separate memory approximation from conditional generation error. Experiments on compressible flows with shocks and multiscale mixing show improved accuracy and markedly more stable long-horizon rollouts, with better fine-scale spectral and statistical fidelity.
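The memory-conditioned flow-matching objective described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's architecture: the linear interpolant is standard flow matching, but `velocity_field`, `update_memory`, the EMA memory encoder, and the linear map `W` are all invented placeholders for the learned denoiser and compact online state.

```python
import numpy as np

rng = np.random.default_rng(0)

def flow_matching_target(x0, x1):
    # For the linear interpolant x_t = (1-t)*x0 + t*x1,
    # the regression target for the velocity field is x1 - x0.
    return x1 - x0

def velocity_field(x_t, t, memory, W):
    # Toy linear "denoiser": the memory vector is injected by
    # concatenation with the state and time features.
    feats = np.concatenate([x_t, [t], memory])
    return W @ feats

def update_memory(memory, x, alpha=0.9):
    # Compact online state: an exponential moving average of past
    # resolved snapshots, standing in for a learned memory encoder.
    # (Here the memory and state dimensions coincide for simplicity.)
    return alpha * memory + (1 - alpha) * x

# One step of the memory-conditioned flow-matching loss.
d, m = 4, 4
W = np.zeros((d, d + 1 + m))      # untrained placeholder weights
x0 = rng.normal(size=d)           # noise sample
x1 = rng.normal(size=d)           # next resolved state (data)
memory = update_memory(np.zeros(m), x1)
t = rng.uniform()
x_t = (1 - t) * x0 + t * x1
loss = np.mean((velocity_field(x_t, t, memory, W)
                - flow_matching_target(x0, x1)) ** 2)
```

In training, `W` would be replaced by a neural velocity network and the loss averaged over samples of `(x0, x1, t)`; rollouts then integrate the learned velocity field while the memory state is updated autoregressively.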
Related papers
- Generalizing GNNs with Tokenized Mixture of Experts [75.8310720413187]
We show that improving stability requires reducing reliance on shift-sensitive features, leaving an irreducible worst-case generalization floor. We propose STEM-GNN, a pretrain-then-finetune framework with a mixture-of-experts encoder for diverse computation paths. Across nine node, link, and graph benchmarks, STEM-GNN achieves a stronger three-way balance, improving robustness to degree/homophily shifts and to feature/edge corruptions while remaining competitive on clean graphs.
arXiv Detail & Related papers (2026-02-09T22:48:30Z)
- FluxNet: Learning Capacity-Constrained Local Transport Operators for Conservative and Bounded PDE Surrogates [13.645285242786008]
We introduce a framework for learning conservative transport operators on regular grids. Instead of predicting the next state, the model outputs local transport operators that update cells through neighborhood exchanges. Experiments on shallow-water equations and traffic flow show improved rollout stability and physical consistency over strong baselines.
arXiv Detail & Related papers (2026-02-02T10:44:10Z)
- Avoiding Premature Collapse: Adaptive Annealing for Entropy-Regularized Structural Inference [1.7523718031184992]
We identify a fundamental mechanism for this failure: Premature Mode Collapse. We propose Efficient Piecewise Hybrid Adaptive Stability Control (EPH-ASC), an adaptive scheduling algorithm that monitors the stability of the inference process.
arXiv Detail & Related papers (2026-01-30T14:47:18Z)
- Stationary Reweighting Yields Local Convergence of Soft Fitted Q-Iteration [40.322273308230606]
We show that fitted Q-iteration and its entropy-regularized variant, soft FQI, behave poorly under function approximation and distribution shift. We introduce stationary-reweighted soft FQI, which reweights each regression update using the stationary distribution of the current policy. Our analysis suggests that global convergence may be recovered by gradually reducing the softmax temperature.
arXiv Detail & Related papers (2025-12-30T00:58:35Z)
- Breaking the Memory Wall: Exact Analytical Differentiation via Tiled Operator-Space Evolution [3.551701030393209]
Phase Gradient Flow (PGF) is a framework that computes exact analytical derivatives by operating directly in the state-space manifold. Our method delivers O(1) memory complexity relative to sequence length, yielding a 94% reduction in peak VRAM and a 23x increase in throughput compared to standard Autograd. Our work enables chromosome-scale sensitivity analysis on a single GPU, bridging the gap between theoretical infinite-context models and practical hardware limitations.
arXiv Detail & Related papers (2025-12-28T20:27:58Z)
- Gated KalmaNet: A Fading Memory Layer Through Test-Time Ridge Regression [53.48692193399171]
Gated KalmaNet (GKA) is a layer that reduces the gap by accounting for the full past when predicting the next token. We solve an online ridge regression problem at test time, with constant memory and linear compute cost in the sequence length. On long-context, GKA excels at real-world RAG and LongQA tasks up to 128k tokens, achieving more than 10% relative improvement over other fading memory baselines.
arXiv Detail & Related papers (2025-11-26T03:26:37Z)
- Schrödinger bridge for generative AI: Soft-constrained formulation and convergence analysis [6.584866740785309]
We study the so-called soft-constrained Schrödinger bridge problem (SCSBP). We prove that as the penalty grows, both the controls and value functions converge to those of the classical SBP at a linear rate. These results provide the first quantitative convergence guarantees for soft-constrained bridges.
arXiv Detail & Related papers (2025-10-13T18:29:15Z)
- Continuously Augmented Discrete Diffusion model for Categorical Generative Modeling [87.34677262370924]
Standard discrete diffusion models treat all unobserved states identically by mapping them to an absorbing [MASK] token. This creates an 'information void' where semantic information that could be inferred from unmasked tokens is lost between denoising steps. We introduce Continuously Augmented Discrete Diffusion, a framework that augments the discrete state space with a paired diffusion in a continuous latent space.
arXiv Detail & Related papers (2025-10-01T18:00:56Z)
- Exact dynamics of quantum dissipative $XX$ models: Wannier-Stark localization in the fragmented operator space [49.1574468325115]
We find an exceptional point at a critical dissipation strength that separates oscillating and non-oscillating decay.
We also describe a different type of dissipation that leads to a single decay mode in the whole operator subspace.
arXiv Detail & Related papers (2024-05-27T16:11:39Z)
- Generative Fractional Diffusion Models [53.36835573822926]
We introduce the first continuous-time score-based generative model that leverages fractional diffusion processes for its underlying dynamics.
Our evaluations on real image datasets demonstrate that GFDM achieves greater pixel-wise diversity and enhanced image quality, as indicated by a lower FID.
arXiv Detail & Related papers (2023-10-26T17:53:24Z)
- Machine learning in and out of equilibrium [58.88325379746631]
Our study uses a Fokker-Planck approach, adapted from statistical physics, to explore these parallels.
We focus in particular on the stationary state of the system in the long-time limit, which in conventional SGD is out of equilibrium.
We propose a new variation of stochastic gradient Langevin dynamics (SGLD) that harnesses without-replacement minibatching.
arXiv Detail & Related papers (2023-06-06T09:12:49Z)
- Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability [69.01076284478151]
In machine learning optimization, gradient descent (GD) often operates at the edge of stability (EoS).
This paper studies the convergence and implicit bias of constant-stepsize GD for logistic regression on linearly separable data in the EoS regime.
arXiv Detail & Related papers (2023-05-19T16:24:47Z)
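The setting studied in the last paper above can be sketched in a few lines: constant-stepsize GD on logistic regression over linearly separable data. This is a toy illustration only; the dataset and step size below are invented, and it demonstrates the well-known behavior that on separable data the loss keeps shrinking while the weight norm grows without bound, not the paper's EoS analysis itself.

```python
import numpy as np

# Linearly separable 1-D data: label +1 for x > 0, -1 for x < 0.
X = np.array([[1.0], [2.0], [-1.0], [-2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

def logistic_loss(w):
    # Mean logistic loss: log(1 + exp(-y * <w, x>)).
    return np.mean(np.log1p(np.exp(-y * (X @ w))))

def grad(w):
    # Gradient of the mean logistic loss with respect to w.
    s = -y / (1.0 + np.exp(y * (X @ w)))
    return (s[:, None] * X).mean(axis=0)

w = np.zeros(1)
eta = 4.0                       # deliberately large constant step size
losses = [logistic_loss(w)]
for _ in range(200):
    w = w - eta * grad(w)
    losses.append(logistic_loss(w))

# On separable data the loss trends toward zero while |w| diverges;
# the direction w/|w| aligns with the max-margin separator.
```

In higher dimensions and at genuinely EoS-scale step sizes, the loss is no longer monotone along the trajectory, which is the regime the paper analyzes.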
This list is automatically generated from the titles and abstracts of the papers in this site.