Related papers: PlatoLTL: Learning to Generalize Across Symbols in LTL Instructions for Multi-Task RL

URL: http://arxiv.org/abs/2601.22891v1
Date: Fri, 30 Jan 2026 12:11:55 GMT
Title: PlatoLTL: Learning to Generalize Across Symbols in LTL Instructions for Multi-Task RL
Authors: Jacques Cloete, Mathias Jackermeier, Ioannis Havoutis, Alessandro Abate
Abstract summary: Linear temporal logic (LTL) is a powerful formalism for specifying structured, temporally extended tasks to RL agents. We present PlatoLTL, a novel approach that enables policies to zero-shot generalize not only compositionally across formula structures, but also parametrically across propositions.
Score: 55.58188508467081
License: http://creativecommons.org/licenses/by/4.0/
Abstract: A central challenge in multi-task reinforcement learning (RL) is to train generalist policies capable of performing tasks not seen during training. To facilitate such generalization, linear temporal logic (LTL) has recently emerged as a powerful formalism for specifying structured, temporally extended tasks to RL agents. While existing approaches to LTL-guided multi-task RL demonstrate successful generalization across LTL specifications, they are unable to generalize to unseen vocabularies of propositions (or "symbols"), which describe high-level events in LTL. We present PlatoLTL, a novel approach that enables policies to zero-shot generalize not only compositionally across LTL formula structures, but also parametrically across propositions. We achieve this by treating propositions as instances of parameterized predicates rather than discrete symbols, allowing policies to learn shared structure across related propositions. We propose a novel architecture that embeds and composes predicates to represent LTL specifications, and demonstrate successful zero-shot generalization to novel propositions and tasks across challenging environments.
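To illustrate the distinction the abstract draws between discrete symbols and parameterized predicates, here is a minimal Python sketch. It is not the paper's architecture: the predicate names (`reach`, `avoid`), the `Proposition` type, and both encoding functions are hypothetical, chosen only to show why a fixed symbol vocabulary cannot represent an unseen proposition while a predicate-plus-parameters representation can.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical predicate names for illustration; not taken from the paper.
PREDICATES: Tuple[str, ...] = ("reach", "avoid")


@dataclass(frozen=True)
class Proposition:
    """A proposition viewed as an instance of a parameterized predicate."""
    predicate: str
    params: Tuple[float, ...]


def one_hot_symbol(symbol: str, vocabulary: Tuple[str, ...]) -> List[float]:
    """Discrete-symbol encoding: only defined for symbols in the vocabulary."""
    if symbol not in vocabulary:
        raise KeyError(f"unseen symbol: {symbol}")
    return [1.0 if s == symbol else 0.0 for s in vocabulary]


def predicate_features(prop: Proposition) -> List[float]:
    """Predicate identity (one-hot) followed by the raw parameter values.

    New parameter values of a known predicate get a valid representation,
    which is what a learned embedding could then build on.
    """
    if prop.predicate not in PREDICATES:
        raise KeyError(f"unknown predicate: {prop.predicate}")
    head = [1.0 if name == prop.predicate else 0.0 for name in PREDICATES]
    return head + list(prop.params)


if __name__ == "__main__":
    # Symbols seen during training (hypothetical).
    train_vocab = ("reach_0.2_0.3", "avoid_0.5_0.5")
    try:
        one_hot_symbol("reach_0.8_0.1", train_vocab)
    except KeyError as err:
        print("discrete encoding fails:", err)
    # The same proposition as a parameterized predicate is representable.
    print(predicate_features(Proposition("reach", (0.8, 0.1))))
```

In this sketch, two `reach` propositions with nearby parameters receive nearby feature vectors, whereas under the one-hot encoding every symbol is equally unrelated to every other. That shared structure is the intuition behind generalizing across propositions; how PlatoLTL actually embeds and composes predicates is described in the paper itself.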

Related papers

Zero-Shot Instruction Following in RL via Structured LTL Representations [50.41415009303967]
We study instruction following in multi-task reinforcement learning, where an agent must zero-shot execute novel tasks not seen during training. In this setting, linear temporal logic has recently been adopted as a powerful framework for specifying structured, temporally extended tasks. While existing approaches successfully train generalist policies, they often struggle to effectively capture the rich logical and temporal structure inherent in specifications.
arXiv Detail & Related papers (2026-02-15T23:22:50Z)
Semantically Labelled Automata for Multi-Task Reinforcement Learning with LTL Instructions [61.479946958462754]
We study multi-task reinforcement learning (RL), a setting in which an agent learns a single, universal policy. We present a novel task embedding technique leveraging a new generation of semantic translations-to-automata.
arXiv Detail & Related papers (2026-02-06T14:46:27Z)
Zero-Shot Instruction Following in RL via Structured LTL Representations [54.08661695738909]
Linear temporal logic (LTL) is a compelling framework for specifying complex, structured tasks for reinforcement learning (RL) agents. Recent work has shown that interpreting instructions as finite automata, which can be seen as high-level programs monitoring task progress, enables learning a single generalist policy capable of executing arbitrary instructions at test time. We propose a novel approach to learning a multi-task policy for following arbitrary instructions that addresses this shortcoming.
arXiv Detail & Related papers (2025-12-02T10:44:51Z)
One Subgoal at a Time: Zero-Shot Generalization to Arbitrary Linear Temporal Logic Requirements in Multi-Task Reinforcement Learning [4.819678320271634]
Generalizing to complex and temporally extended task objectives and safety constraints is a critical challenge in reinforcement learning (RL). In this paper, we introduce GenZ-LTL, a method that enables zero-shot generalization to arbitrary specifications.
arXiv Detail & Related papers (2025-08-03T03:17:49Z)
The Synergy of LLMs & RL Unlocks Offline Learning of Generalizable Language-Conditioned Policies with Low-fidelity Data [50.544186914115045]
TEDUO is a novel training pipeline for offline language-conditioned policy learning in symbolic environments. Our approach harnesses large language models (LLMs) in a dual capacity: first, as automatization tools augmenting offline datasets with richer annotations, and second, as generalizable instruction-following agents.
arXiv Detail & Related papers (2024-12-09T18:43:56Z)
DeepLTL: Learning to Efficiently Satisfy Complex LTL Specifications for Multi-Task RL [59.01527054553122]
Linear temporal logic (LTL) has recently been adopted as a powerful formalism for specifying complex, temporally extended tasks. Existing approaches suffer from several shortcomings. We propose a novel learning approach to address these concerns.
arXiv Detail & Related papers (2024-10-06T21:30:38Z)
TEGEE: Task dEfinition Guided Expert Ensembling for Generalizable and Few-shot Learning [37.09785060896196]
We propose TEGEE (Task Definition Guided Expert Ensembling), a method that explicitly extracts task definitions. Our framework employs a dual 3B model approach, with each model assigned a distinct role. Empirical evaluations show that TEGEE performs comparably to the larger LLaMA2-13B model.
arXiv Detail & Related papers (2024-03-07T05:26:41Z)
LTL2Action: Generalizing LTL Instructions for Multi-Task RL [4.245018630914216]
We address the problem of teaching a deep reinforcement learning (RL) agent to follow instructions in multi-task environments. We employ a well-known formal language -- linear temporal logic (LTL) -- to specify instructions, using a domain-specific vocabulary.
arXiv Detail & Related papers (2021-02-13T04:05:46Z)

This list is automatically generated from the titles and abstracts of the papers in this site.