Enhanced Transformer architecture for in-context learning of dynamical systems
- URL: http://arxiv.org/abs/2410.03291v1
- Date: Fri, 4 Oct 2024 10:05:15 GMT
- Title: Enhanced Transformer architecture for in-context learning of dynamical systems
- Authors: Matteo Rufolo, Dario Piga, Gabriele Maroni, Marco Forgione
- Abstract summary: In this paper, we enhance the original meta-modeling framework through three key innovations.
The efficacy of these modifications is demonstrated through a numerical example focusing on the Wiener-Hammerstein system class.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently introduced by some of the authors, the in-context identification paradigm aims at estimating, offline and based on synthetic data, a meta-model that describes the behavior of a whole class of systems. Once trained, this meta-model is fed with an observed input/output sequence (context) generated by a real system to predict its behavior in a zero-shot learning fashion. In this paper, we enhance the original meta-modeling framework through three key innovations: by formulating the learning task within a probabilistic framework; by managing non-contiguous context and query windows; and by adopting recurrent patching to effectively handle long context sequences. The efficacy of these modifications is demonstrated through a numerical example focusing on the Wiener-Hammerstein system class, highlighting the model's enhanced performance and scalability.
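To make the three innovations concrete, here is a minimal sketch of what such a meta-model could look like in PyTorch: a small GRU folds each patch of context samples into one token (recurrent patching), context and query windows enter the Transformer as separate token groups (so they need not be contiguous in time), and the head returns a Gaussian mean and log-variance trained by negative log-likelihood (the probabilistic formulation). All class names, sizes, and wiring below are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class InContextMetaModel(nn.Module):
    """Illustrative probabilistic meta-model; not the authors' architecture."""

    def __init__(self, d_model=128, patch_len=16, n_layers=4, n_heads=8):
        super().__init__()
        self.patch_len = patch_len
        # Recurrent patching: a small GRU folds patch_len (u, y) samples
        # into a single token, shortening very long contexts.
        self.patcher = nn.GRU(input_size=2, hidden_size=d_model, batch_first=True)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.query_proj = nn.Linear(1, d_model)  # embed query inputs u
        self.head = nn.Linear(d_model, 2)        # Gaussian mean and log-variance

    def forward(self, u_ctx, y_ctx, u_query):
        # u_ctx, y_ctx: (B, T) context window; u_query: (B, Q) query inputs.
        B, T = u_ctx.shape
        ctx = torch.stack([u_ctx, y_ctx], dim=-1)             # (B, T, 2)
        n_patches = T // self.patch_len
        patches = ctx[:, : n_patches * self.patch_len]
        patches = patches.reshape(B * n_patches, self.patch_len, 2)
        _, h = self.patcher(patches)                          # (1, B*P, d)
        ctx_tokens = h[-1].reshape(B, n_patches, -1)
        # Non-contiguous windows: the query enters as its own token group,
        # so context and query need not be adjacent in time.
        q_tokens = self.query_proj(u_query.unsqueeze(-1))     # (B, Q, d)
        z = self.encoder(torch.cat([ctx_tokens, q_tokens], dim=1))
        mean, log_var = self.head(z[:, n_patches:]).unbind(-1)
        return mean, log_var


def nll_loss(mean, log_var, y):
    # Probabilistic framework: Gaussian negative log-likelihood.
    return 0.5 * (log_var + (y - mean) ** 2 / log_var.exp()).mean()


# Hypothetical usage: 8 systems per batch, 256 context samples, 32 queries.
model = InContextMetaModel()
u_c, y_c, u_q = torch.randn(8, 256), torch.randn(8, 256), torch.randn(8, 32)
mean, log_var = model(u_c, y_c, u_q)  # predictive distribution over query outputs
```

Under these assumptions, folding 16 samples per patch shortens a 256-sample context to 16 tokens, which is the scalability argument behind patching.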
Related papers
- On the adaptation of in-context learners for system identification [0.0]
In-context system identification aims at constructing meta-models to describe classes of systems.
We show how meta-model adaptation can enhance predictive performance in three realistic scenarios.
arXiv Detail & Related papers (2023-12-07T06:51:55Z)
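A minimal sketch of the adaptation idea in the entry above, assuming a pretrained meta-model with the (u_ctx, y_ctx, u_query) -> (mean, log_var) interface of the previous sketch: adaptation is read here simply as a few gradient steps on data recorded from the target system, a generic fine-tuning recipe rather than the paper's exact procedure.

```python
import torch

def adapt(meta_model, u_ctx, y_ctx, u_new, y_new, steps=100, lr=1e-4):
    """Fine-tune a pretrained meta-model on a short record from the target
    system. Generic sketch; the paper's scenarios and losses may differ."""
    opt = torch.optim.Adam(meta_model.parameters(), lr=lr)
    nll = torch.nn.GaussianNLLLoss()
    for _ in range(steps):
        mean, log_var = meta_model(u_ctx, y_ctx, u_new)  # interface assumed
        loss = nll(mean, y_new, log_var.exp())
        opt.zero_grad()
        loss.backward()
        opt.step()
    return meta_model
```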
- From system models to class models: An in-context learning paradigm [0.0]
We introduce a novel paradigm for system identification, addressing two primary tasks: one-step-ahead prediction and multi-step simulation.
We learn a meta model that represents a class of dynamical systems.
For one-step prediction, a GPT-like decoder-only architecture is utilized, whereas the simulation problem employs an encoder-decoder structure.
arXiv Detail & Related papers (2023-08-25T13:50:17Z)
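The decoder-only choice for one-step-ahead prediction mentioned above can be pictured as a causal Transformer over (u, y) tokens. The sketch below is one hypothetical reading of that design, not the paper's code; the encoder-decoder simulation variant is omitted.

```python
import torch
import torch.nn as nn

class DecoderOnlyPredictor(nn.Module):
    """Hypothetical GPT-like model for one-step-ahead prediction."""

    def __init__(self, d_model=128, n_layers=6, n_heads=8, max_len=1024):
        super().__init__()
        self.embed = nn.Linear(2, d_model)        # one token per (u, y) sample
        self.pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.out = nn.Linear(d_model, 1)

    def forward(self, u, y):
        # u, y: (B, T); the output at position t is the prediction of y[t+1],
        # computed from samples 0..t only thanks to the causal mask.
        B, T = u.shape
        x = self.embed(torch.stack([u, y], dim=-1))
        x = x + self.pos(torch.arange(T, device=u.device))
        causal = torch.triu(
            torch.full((T, T), float("-inf"), device=u.device), diagonal=1
        )
        h = self.blocks(x, mask=causal)
        return self.out(h).squeeze(-1)            # (B, T) one-step predictions
```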
- End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures.
We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z)
- Latent Variable Representation for Reinforcement Learning [131.03944557979725]
It remains unclear theoretically and empirically how latent variable models may facilitate learning, planning, and exploration to improve the sample efficiency of model-based reinforcement learning.
We provide a representation view of latent variable models for state-action value functions, which allows both a tractable variational learning algorithm and an effective implementation of the optimism/pessimism principle.
In particular, we propose a computationally efficient planning algorithm with UCB exploration by incorporating kernel embeddings of latent variable models.
arXiv Detail & Related papers (2022-12-17T00:26:31Z)
- Robust Graph Representation Learning via Predictive Coding [46.22695915912123]
Predictive coding is a message-passing framework initially developed to model information processing in the brain.
In this work, we build models that rely on the message-passing rule of predictive coding.
We show that the proposed models are comparable to standard ones in terms of performance in both inductive and transductive tasks.
arXiv Detail & Related papers (2022-12-09T03:58:22Z)
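The predictive-coding message-passing rule referenced above can be illustrated with the generic two-layer update, in which a prediction error drives both a relaxation of the latent values and a local, Hebbian-like weight update. This textbook form is a stand-in for the paper's graph variant, whose details differ.

```python
import torch

def predictive_coding_step(x_low, x_high, W, lr_x=0.1, lr_w=0.01):
    """One relaxation + learning step; generic form, not the paper's variant."""
    pred = x_high @ W            # top-down prediction of the lower layer
    err = x_low - pred           # prediction error: the message being passed
    x_high = x_high + lr_x * err @ W.t()   # relax values along the error
    W = W + lr_w * x_high.t() @ err        # local, Hebbian-like weight update
    return x_high, W

# Toy usage with assumed sizes: 4 samples, 10 low-level and 6 high-level units.
x_low, x_high = torch.randn(4, 10), torch.randn(4, 6)
W = 0.1 * torch.randn(6, 10)
x_high, W = predictive_coding_step(x_low, x_high, W)
```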
- Supervised DKRC with Images for Offline System Identification [77.34726150561087]
Modern dynamical systems are becoming increasingly non-linear and complex.
There is a need for a framework to model these systems in a compact and comprehensive representation for prediction and control.
Our approach learns these basis functions via supervised learning.
arXiv Detail & Related papers (2021-09-06T04:39:06Z)
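Reading DKRC as a deep Koopman representation for control, the supervised learning of basis functions mentioned above can be sketched as a network phi that lifts the state into observables where a single linear map K advances the dynamics. Names, sizes, and the one-step loss below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class KoopmanLift(nn.Module):
    """Illustrative deep Koopman model: phi lifts the state, K is linear."""

    def __init__(self, state_dim=4, n_obs=32, hidden=64):
        super().__init__()
        self.phi = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(), nn.Linear(hidden, n_obs)
        )
        self.K = nn.Linear(n_obs, n_obs, bias=False)  # linear dynamics in lifted space

    def loss(self, x_t, x_next):
        # Supervised objective: advancing the lifted state with K should
        # match the lifting of the true next state.
        return ((self.phi(x_next) - self.K(self.phi(x_t))) ** 2).mean()
```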
- Meta-learning using privileged information for dynamics [66.32254395574994]
We extend the Neural ODE Process model to use additional information within the Learning Using Privileged Information setting.
We validate our extension with experiments showing improved accuracy and calibration on simulated dynamics tasks.
arXiv Detail & Related papers (2021-04-29T12:18:02Z)
- Explore the Context: Optimal Data Collection for Context-Conditional Dynamics Models [7.766117084613689]
We learn dynamics models for parametrized families of dynamical systems with varying properties.
We compute an action sequence which, for a limited number of environment interactions, optimally explores the given system.
We demonstrate the effectiveness of our method for exploration on a non-linear toy problem and in two well-known reinforcement learning environments.
arXiv Detail & Related papers (2021-02-22T22:52:39Z)
- Model-based Meta Reinforcement Learning using Graph Structured Surrogate Models [40.08137765886609]
We show that our model, called a graph structured surrogate model (GSSM), outperforms state-of-the-art methods in predicting environment dynamics.
Our approach is able to obtain high returns, while allowing fast execution during deployment by avoiding test time policy gradient optimization.
arXiv Detail & Related papers (2021-02-16T17:21:55Z)
- Neural Entity Linking: A Survey of Models Based on Deep Learning [82.43751915717225]
This survey presents a comprehensive description of recent neural entity linking (EL) systems developed since 2015.
Its goal is to systematize the design features of neural entity linking systems and compare their performance to well-established classical methods on common benchmarks.
The survey touches on applications of entity linking, focusing on the recently emerged use-case of enhancing deep pre-trained masked language models.
arXiv Detail & Related papers (2020-05-31T18:02:26Z)
- Meta-learning framework with applications to zero-shot time-series forecasting [82.61728230984099]
This work provides positive evidence using a broad meta-learning framework in which residual connections act as a meta-learning adaptation mechanism (see the sketch after this entry).
We show that it is viable to train a neural network on a source TS dataset and deploy it on a different target TS dataset without retraining.
arXiv Detail & Related papers (2020-02-07T16:39:43Z)
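The remark that residual connections act as an adaptation mechanism can be pictured with an N-BEATS-style residual stack: each block subtracts its backcast from the running residual and contributes to the forecast, which is one reading of why a network trained on a source time-series dataset can transfer zero-shot. A hypothetical minimal version:

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    """One residual block: explains part of the window, forecasts part ahead."""

    def __init__(self, backcast_len, forecast_len, hidden=256):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(backcast_len, hidden), nn.ReLU(),
            nn.Linear(hidden, backcast_len + forecast_len),
        )
        self.backcast_len = backcast_len

    def forward(self, x):
        out = self.body(x)
        return out[..., :self.backcast_len], out[..., self.backcast_len:]

class ResidualStack(nn.Module):
    """Illustrative N-BEATS-style stack; sizes are assumptions."""

    def __init__(self, backcast_len=64, forecast_len=16, n_blocks=4):
        super().__init__()
        self.blocks = nn.ModuleList(
            Block(backcast_len, forecast_len) for _ in range(n_blocks)
        )

    def forward(self, x):                      # x: (B, backcast_len)
        forecast = 0.0
        for block in self.blocks:
            backcast, f = block(x)
            x = x - backcast                   # residual: remove what's explained
            forecast = forecast + f
        return forecast                        # (B, forecast_len)
```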