Related papers: LLMs learn governing principles of dynamical systems, revealing an in-context neural scaling law

URL: http://arxiv.org/abs/2402.00795v4
Date: Wed, 09 Oct 2024 16:02:13 GMT
Title: LLMs learn governing principles of dynamical systems, revealing an in-context neural scaling law
Authors: Toni J. B. Liu, Nicolas Boullé, Raphaël Sarfati, Christopher J. Earls
Abstract summary: A language model trained primarily on texts achieves accurate predictions of dynamical system time series without fine-tuning or prompt engineering. We present a flexible and efficient algorithm for extracting probability density functions of multi-digit numbers directly from LLMs.
Score: 3.281128493853064
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Pretrained large language models (LLMs) are surprisingly effective at performing zero-shot tasks, including time-series forecasting. However, understanding the mechanisms behind such capabilities remains highly challenging due to the complexity of the models. We study LLMs' ability to extrapolate the behavior of dynamical systems whose evolution is governed by principles of physical interest. Our results show that LLaMA 2, a language model trained primarily on texts, achieves accurate predictions of dynamical system time series without fine-tuning or prompt engineering. Moreover, the accuracy of the learned physical rules increases with the length of the input context window, revealing an in-context version of neural scaling law. Along the way, we present a flexible and efficient algorithm for extracting probability density functions of multi-digit numbers directly from LLMs.

Related papers

LLM4Fluid: Large Language Models as Generalizable Neural Solvers for Fluid Dynamics [33.520440710387724]
Deep-temporal learning has emerged as a promising paradigm for modeling fluid dynamics. We present a framework that leverages Large Language Models (LLMs) as general neural solvers for fluid dynamics.
arXiv Detail & Related papers (2026-01-29T13:14:48Z)
Uncovering Emergent Physics Representations Learned In-Context by Large Language Models [1.8749305679160366]
Large language models (LLMs) exhibit impressive in-context learning (ICL) abilities, enabling them to solve a wide range of tasks via textual prompts alone. Here, we investigate the ICL ability of LLMs, especially focusing on their ability to reason about physics. Using a dynamics forecasting task in physical systems as a proxy, we evaluate whether LLMs can learn physics in context.
arXiv Detail & Related papers (2025-08-17T17:49:17Z)
Reparameterized LLM Training via Orthogonal Equivalence Transformation [54.80172809738605]
We present POET, a novel training algorithm that uses Orthogonal Equivalence Transformation to optimize neurons. POET can stably optimize the objective function with improved generalization. We develop efficient approximations that make POET flexible and scalable for training large-scale neural networks.
arXiv Detail & Related papers (2025-06-09T17:59:34Z)
When can isotropy help adapt LLMs' next word prediction to numerical domains? [53.98633183204453]
It is shown that the isotropic property of LLM embeddings in contextual embedding space preserves the underlying structure of representations. Experiments show that different characteristics of numerical data and model architectures have different impacts on isotropy.
arXiv Detail & Related papers (2025-05-22T05:10:34Z)
Physics Informed Constrained Learning of Dynamics from Static Data [8.346864633675414]
A physics-informed neural network (PINN) models the dynamics of a system by integrating the governing physical laws into the architecture of a neural network. Existing PINN frameworks rely on fully observed time-course data, the acquisition of which could be prohibitive for many systems. In this study, we developed a new PINN learning paradigm, namely Constrained Learning, that enables the approximation of first-order derivatives or motions using non-time course or partially observed data.
arXiv Detail & Related papers (2025-04-17T06:06:53Z)
LLM-PS: Empowering Large Language Models for Time Series Forecasting with Temporal Patterns and Semantics [56.99021951927683]
Time Series Forecasting (TSF) is critical in many real-world domains like financial planning and health monitoring. Existing Large Language Models (LLMs) usually perform suboptimally because they neglect the inherent characteristics of time series data. We propose LLM-PS to empower the LLM for TSF by learning the fundamental Patterns and meaningful Semantics from time series data.
arXiv Detail & Related papers (2025-03-12T11:45:11Z)
Can a Large Language Model Learn Matrix Functions In Context? [3.7478782183628634]
Large Language Models (LLMs) have demonstrated the ability to solve complex tasks through In-Context Learning (ICL). This paper explores the capacity of LLMs to solve non-linear numerical computations, with specific emphasis on functions of the Singular Value Decomposition.
arXiv Detail & Related papers (2024-11-24T00:33:43Z)
CoMMIT: Coordinated Instruction Tuning for Multimodal Large Language Models [68.64605538559312]
In this paper, we analyze the MLLM instruction tuning from both theoretical and empirical perspectives. Inspired by our findings, we propose a measurement to quantitatively evaluate the learning balance. In addition, we introduce an auxiliary loss regularization method to promote updating of the generation distribution of MLLMs.
arXiv Detail & Related papers (2024-07-29T23:18:55Z)
FLUID-LLM: Learning Computational Fluid Dynamics with Spatiotemporal-aware Large Language Models [15.964726158869777]
Large language models (LLMs) have shown remarkable pattern recognition and reasoning abilities. We introduce FLUID-LLM, a novel framework combining pre-trained LLMs with spatiotemporal-aware encoding to predict unsteady fluid dynamics. Our results demonstrate that FLUID-LLM effectively integrates spatiotemporal information into pre-trained LLMs, enhancing CFD task performance.
arXiv Detail & Related papers (2024-06-06T20:55:40Z)
Verbalized Machine Learning: Revisiting Machine Learning with Language Models [63.10391314749408]
We introduce the framework of verbalized machine learning (VML). VML constrains the parameter space to be human-interpretable natural language. We empirically verify the effectiveness of VML, and hope that VML can serve as a stepping stone to stronger interpretability.
arXiv Detail & Related papers (2024-06-06T17:59:56Z)
Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension [63.330262740414646]
We study how to characterize and predict the truthfulness of texts generated from large language models (LLMs). We suggest investigating internal activations and quantifying an LLM's truthfulness using the local intrinsic dimension (LID) of model activations.
arXiv Detail & Related papers (2024-02-28T04:56:21Z)
In-Context Language Learning: Architectures and Algorithms [73.93205821154605]
We study ICL through the lens of a new family of model problems we term in-context language learning (ICLL). We evaluate a diverse set of neural sequence models on regular ICLL tasks.
arXiv Detail & Related papers (2024-01-23T18:59:21Z)
In-Context Learning Dynamics with Random Binary Sequences [16.645695664776433]
We propose a framework that enables us to analyze in-context learning dynamics. Inspired by the cognitive science of human perception, we use random binary sequences as context. In the latest GPT-3.5+ models, we find emergent abilities to generate seemingly random numbers and learn basic formal languages.
arXiv Detail & Related papers (2023-10-26T17:54:52Z)
Time-LLM: Time Series Forecasting by Reprogramming Large Language Models [110.20279343734548]
Time series forecasting holds significant importance in many real-world dynamic systems. We present Time-LLM, a reprogramming framework to repurpose large language models for time series forecasting. Time-LLM is a powerful time series learner that outperforms state-of-the-art, specialized forecasting models.
arXiv Detail & Related papers (2023-10-03T01:31:25Z)
Graph Neural Prompting with Large Language Models [32.97391910476073]
Graph Neural Prompting (GNP) is a novel plug-and-play method to assist pre-trained language models in learning beneficial knowledge from knowledge graphs. Extensive experiments on multiple datasets demonstrate the superiority of GNP on both commonsense and biomedical reasoning tasks.
arXiv Detail & Related papers (2023-09-27T06:33:29Z)
A Survey of Large Language Models [81.06947636926638]
Language modeling has been widely studied for language understanding and generation in the past two decades. Recently, pre-trained language models (PLMs) have been proposed by pre-training Transformer models over large-scale corpora. To discriminate the difference in parameter scale, the research community has coined the term large language models (LLM) for the PLMs of significant size.
arXiv Detail & Related papers (2023-03-31T17:28:46Z)
Differentially Private Decoding in Large Language Models [14.221692239892207]
We propose a simple, easy to interpret, and computationally lightweight perturbation mechanism to be applied to an already trained model at the decoding stage. Our perturbation mechanism is model-agnostic and can be used in conjunction with any Large Language Model.
arXiv Detail & Related papers (2022-05-26T20:50:58Z)

This list is automatically generated from the titles and abstracts of the papers on this site.