Related papers: Text-Trained LLMs Can Zero-Shot Extrapolate PDE Dynamics

URL: http://arxiv.org/abs/2509.06322v1
Date: Mon, 08 Sep 2025 04:08:50 GMT
Title: Text-Trained LLMs Can Zero-Shot Extrapolate PDE Dynamics
Authors: Jiajun Bao, Nicolas Boullé, Toni J. B. Liu, Raphaël Sarfati, Christopher J. Earls
Abstract summary: Large language models (LLMs) have demonstrated emergent in-context learning (ICL) capabilities across a range of tasks. We show that text-trained foundation models can accurately predict dynamics from discretized partial differential equation (PDE) solutions. We analyze token-level output distributions and uncover a consistent ICL progression: beginning with syntactic pattern imitation, transitioning through an exploratory high-entropy phase, and culminating in confident, numerically grounded predictions.
Score: 10.472535430038759
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) have demonstrated emergent in-context learning (ICL) capabilities across a range of tasks, including zero-shot time-series forecasting. We show that text-trained foundation models can accurately extrapolate spatiotemporal dynamics from discretized partial differential equation (PDE) solutions without fine-tuning or natural language prompting. Predictive accuracy improves with longer temporal contexts but degrades at finer spatial discretizations. In multi-step rollouts, where the model recursively predicts future spatial states over multiple time steps, errors grow algebraically with the time horizon, reminiscent of global error accumulation in classical finite-difference solvers. We interpret these trends as in-context neural scaling laws, where prediction quality varies predictably with both context length and output length. To better understand how LLMs are able to internally process PDE solutions so as to accurately roll them out, we analyze token-level output distributions and uncover a consistent ICL progression: beginning with syntactic pattern imitation, transitioning through an exploratory high-entropy phase, and culminating in confident, numerically grounded predictions.
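The recursive multi-step rollout and the algebraic (power-law) error growth mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the one-step predictor here is a stand-in (an explicit finite-difference heat-equation step with a slightly wrong coefficient) rather than an LLM, and the names `heat_step`, `rollout`, and `fit_power_law` are hypothetical helpers chosen for illustration.

```python
import numpy as np


def heat_step(u, r):
    """One explicit finite-difference step of u_t = u_xx on a periodic grid.

    r = dt / dx**2 and should stay <= 0.5 for stability.
    """
    return u + r * (np.roll(u, -1) - 2.0 * u + np.roll(u, 1))


def rollout(step_fn, u0, n_steps):
    """Recursively apply a one-step predictor, feeding each output back in."""
    states = [u0]
    for _ in range(n_steps):
        states.append(step_fn(states[-1]))
    return np.stack(states[1:])  # shape: (n_steps, n_points)


def fit_power_law(horizons, errors):
    """Least-squares fit of errors ~ C * horizons**p in log-log space."""
    p, log_c = np.polyfit(np.log(horizons), np.log(errors), 1)
    return p, np.exp(log_c)


# Initial condition: one sine mode on a periodic grid of 64 points.
x = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
u0 = np.sin(x)
n_steps = 50

# Reference trajectory, and a stand-in "imperfect predictor" whose
# coefficient is slightly off (assumption: this mimics a small one-step error).
reference = rollout(lambda u: heat_step(u, 0.20), u0, n_steps)
predicted = rollout(lambda u: heat_step(u, 0.21), u0, n_steps)

# Root-mean-square error over space at each rollout horizon.
horizons = np.arange(1, n_steps + 1)
errors = np.sqrt(np.mean((predicted - reference) ** 2, axis=1))

p, c = fit_power_law(horizons, errors)
print(f"fitted growth exponent p = {p:.3f}, prefactor C = {c:.3e}")
```

In this toy setup the fitted exponent comes out close to 1, that is, the error grows roughly linearly with the horizon. To probe the same trend for another model, one would replace the stand-in predictor passed to `rollout` with a wrapper around that model's one-step forecasts.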

Related papers

Stochastic Deep Learning: A Probabilistic Framework for Modeling Uncertainty in Structured Temporal Data [0.0]
I propose a novel framework that integrates stochastic differential equations (SDEs) with deep generative models to improve uncertainty quantification in machine learning applications involving structured and temporal data. This approach, termed Stochastic Latent Differential Inference (SLDI), embeds an Itô SDE in the latent space of a variational autoencoder. The drift and diffusion terms of the SDE are parameterized by neural networks, enabling data-driven inference and generalizing classical time series models to handle irregular sampling and complex dynamic structure.
arXiv Detail & Related papers (2026-01-08T18:53:59Z)
How Different from the Past? Spatio-Temporal Time Series Forecasting with Self-Supervised Deviation Learning [15.102926671713668]
We propose ST-SSDL, a Spatio-Temporal time series forecasting framework. It discretizes latent space using learnable prototypes that represent typical spatio-temporal patterns. Experiments show that ST-SSDL consistently outperforms state-of-the-art baselines across multiple metrics.
arXiv Detail & Related papers (2025-10-06T15:21:13Z)
PDE Solvers Should Be Local: Fast, Stable Rollouts with Learned Local Stencils [20.49015396991881]
We present FINO, a finite-difference-inspired neural architecture that enforces strict locality. FINO replaces fixed finite-difference stencil coefficients with learnable convolutional kernels. It achieves up to 44% lower error and speedups of up to around 2× over state-of-the-art operator-learning baselines.
arXiv Detail & Related papers (2025-09-30T12:42:32Z)
When can isotropy help adapt LLMs' next word prediction to numerical domains? [53.98633183204453]
It is shown that the isotropic property of LLM embeddings in contextual embedding space preserves the underlying structure of representations. Experiments show that different characteristics of numerical data and model architectures have different impacts on isotropy.
arXiv Detail & Related papers (2025-05-22T05:10:34Z)
Generative Latent Neural PDE Solver using Flow Matching [8.397730500554047]
We propose a latent diffusion model for PDE simulation that embeds the PDE state in a lower-dimensional latent space. Our framework uses an autoencoder to map different types of meshes onto a unified structured latent grid, capturing complex geometries. Numerical experiments show that the proposed model outperforms several deterministic baselines in both accuracy and long-term stability.
arXiv Detail & Related papers (2025-03-28T16:44:28Z)
MultiPDENet: PDE-embedded Learning with Multi-time-stepping for Accelerated Flow Simulation [48.41289705783405]
We propose a PDE-embedded network with multiscale time stepping (MultiPDENet). In particular, we design a convolutional filter based on the structure of finite difference with a small number of parameters to optimize. A Physics Block with a 4th-order Runge-Kutta integrator at the fine time scale is established that embeds the structure of PDEs to guide the prediction.
arXiv Detail & Related papers (2025-01-27T12:15:51Z)
PDETime: Rethinking Long-Term Multivariate Time Series Forecasting from the perspective of partial differential equations [49.80959046861793]
We present PDETime, a novel LMTF model inspired by the principles of Neural PDE solvers. Our experimentation across seven diverse temporal real-world LMTF datasets reveals that PDETime adapts effectively to the intrinsic nature of the data.
arXiv Detail & Related papers (2024-02-25T17:39:44Z)
Uncertainty Quantification for Forward and Inverse Problems of PDEs via Latent Global Evolution [110.99891169486366]
We propose a method that integrates efficient and precise uncertainty quantification into a deep learning-based surrogate model. Our method endows deep learning-based surrogate models with robust and efficient uncertainty quantification capabilities for both forward and inverse problems. Our method excels at propagating uncertainty over extended auto-regressive rollouts, making it suitable for scenarios involving long-term predictions.
arXiv Detail & Related papers (2024-02-13T11:22:59Z)
Neural variational Data Assimilation with Uncertainty Quantification using SPDE priors [28.804041716140194]
Recent advances in the deep learning community make it possible to address the problem through a neural architecture combined with a variational data assimilation framework. In this work we use the theory of Stochastic Partial Differential Equations (SPDE) and Gaussian Processes (GP) to estimate both the space and time covariance of the state.
arXiv Detail & Related papers (2024-02-02T19:18:12Z)
OpenSTL: A Comprehensive Benchmark of Spatio-Temporal Predictive Learning [67.07363529640784]
We propose OpenSTL to categorize prevalent approaches into recurrent-based and recurrent-free models. We conduct standard evaluations on datasets across various domains, including synthetic moving object trajectory, human motion, driving scenes, traffic flow, and weather forecasting. We find that recurrent-free models achieve a better balance between efficiency and performance than recurrent models.
arXiv Detail & Related papers (2023-06-20T03:02:14Z)
Generalized Neural Closure Models with Interpretability [28.269731698116257]
We develop a novel and versatile methodology of unified neural partial delay differential equations. We augment existing/low-fidelity dynamical models directly in their partial differential equation (PDE) forms with both Markovian and non-Markovian neural network (NN) closure parameterizations. We demonstrate the new generalized neural closure models (gnCMs) framework using four sets of experiments based on advecting nonlinear waves, shocks, and ocean acidification models.
arXiv Detail & Related papers (2023-01-15T21:57:43Z)
Continuous-Time Modeling of Counterfactual Outcomes Using Neural Controlled Differential Equations [84.42837346400151]
Estimating counterfactual outcomes over time has the potential to unlock personalized healthcare. Existing causal inference approaches consider regular, discrete-time intervals between observations and treatment decisions. We propose a controllable simulation environment based on a model of tumor growth for a range of scenarios.
arXiv Detail & Related papers (2022-06-16T17:15:15Z)
