LTLCrit: A Temporal Logic-based LLM Critic for Safe and Efficient Embodied Agents
- URL: http://arxiv.org/abs/2507.03293v1
- Date: Fri, 04 Jul 2025 04:53:53 GMT
- Title: LTLCrit: A Temporal Logic-based LLM Critic for Safe and Efficient Embodied Agents
- Authors: Anand Gokhale, Vaibhav Srivastava, Francesco Bullo
- Abstract summary: Large language models (LLMs) have demonstrated promise in reasoning tasks and general decision-making in static environments. We propose a modular actor-critic architecture in which an LLM actor is guided by LTLCrit, a trajectory-level LLM critic. Our setup combines the reasoning strengths of language models with the guarantees of formal logic.
- Score: 5.299803738642663
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) have demonstrated promise in reasoning tasks and general decision-making in static environments. In long-term planning tasks, however, errors tend to accumulate, often leading to unsafe or inefficient behavior, limiting their use in general-purpose settings. We propose a modular actor-critic architecture in which an LLM actor is guided by LTLCrit, a trajectory-level LLM critic that communicates via linear temporal logic (LTL). Our setup combines the reasoning strengths of language models with the guarantees of formal logic. The actor selects high-level actions from natural language observations, while the critic analyzes full trajectories and proposes new LTL constraints that shield the actor from future unsafe or inefficient behavior. The architecture supports both fixed, hand-specified safety constraints and adaptive, learned soft constraints that promote long-term efficiency. Our architecture is model-agnostic: any LLM-based planner can serve as the actor, and LTLCrit serves as a logic-generating wrapper. We formalize planning as graph traversal under symbolic constraints, allowing LTLCrit to analyze failed or suboptimal trajectories and generate new temporal logic rules that improve future behavior. We evaluate our system on the Minecraft diamond-mining benchmark, achieving 100% completion rates and improving efficiency compared to baseline LLM planners. Our results suggest that enabling LLMs to supervise each other through logic is a powerful and flexible paradigm for safe, generalizable decision making.
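To make the control flow concrete, here is a minimal sketch of the actor-critic loop the abstract describes, assuming hypothetical stand-ins (`LTLConstraint`, `propose_actions`, `mine_constraints`) for components the abstract does not specify:

```python
# Minimal sketch of the LTLCrit actor-critic loop; all names below are
# hypothetical placeholders, not the authors' actual interfaces.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class LTLConstraint:
    formula: str                        # e.g. "G(mine_diamond -> has_iron_pickaxe)"
    holds: Callable[[List[str]], bool]  # True if an action sequence satisfies the rule

def shielded_actor_step(observation: str,
                        trajectory: List[str],
                        constraints: List[LTLConstraint],
                        propose_actions: Callable[[str], List[str]]) -> str:
    """LLM actor proposes ranked high-level actions; the LTL constraints
    shield execution by keeping only candidates that stay satisfiable."""
    for action in propose_actions(observation):
        if all(c.holds(trajectory + [action]) for c in constraints):
            return action
    raise RuntimeError("every candidate action violates an active constraint")

def critic_update(trajectory: List[str],
                  task_succeeded: bool,
                  mine_constraints: Callable[[List[str]], List[LTLConstraint]],
                  constraints: List[LTLConstraint]) -> List[LTLConstraint]:
    """Trajectory-level critic: after an unsafe or inefficient episode, an
    LLM proposes new LTL rules that would have blocked the bad behavior."""
    if not task_succeeded:
        constraints = constraints + mine_constraints(trajectory)
    return constraints
```

The hand-specified safety constraints mentioned in the abstract would be seeded into `constraints` up front, while the learned soft constraints accumulate through `critic_update` across episodes.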
Related papers
- ReflecSched: Solving Dynamic Flexible Job-Shop Scheduling via LLM-Powered Hierarchical Reflection [4.101501114944147]
ReflecSched is a framework that empowers the LLM to act as more than a direct scheduler.
It distills simulations across multiple planning horizons into a concise, natural-language summary.
This summary is then integrated into the prompt of a final decision-making module, guiding it to produce non-myopic actions.
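Read as a pipeline, the summary suggests something like the hedged sketch below; `simulate`, `summarize_with_llm`, and `decide_with_llm` are hypothetical placeholders, not ReflecSched's published interface:

```python
# Hedged sketch of the hierarchical-reflection pipeline the summary
# describes; all function names are hypothetical placeholders.
from typing import Callable, List

def reflective_schedule(state: str,
                        horizons: List[int],
                        simulate: Callable[[str, int], str],
                        summarize_with_llm: Callable[[List[str]], str],
                        decide_with_llm: Callable[[str], str]) -> str:
    # 1. Roll out simulations over several planning horizons.
    rollouts = [simulate(state, h) for h in horizons]
    # 2. Distill the rollouts into one concise natural-language summary.
    summary = summarize_with_llm(rollouts)
    # 3. Feed the summary into the final decision prompt so the chosen
    #    action accounts for long-horizon consequences (non-myopic).
    prompt = f"State:\n{state}\n\nStrategic summary:\n{summary}\n\nNext action:"
    return decide_with_llm(prompt)
```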
arXiv Detail & Related papers (2025-08-03T11:26:35Z)
- Do LLMs Dream of Discrete Algorithms? [0.7646713951724011]
Large Language Models (LLMs) have rapidly transformed the landscape of artificial intelligence.
Their reliance on probabilistic inference limits their effectiveness in domains requiring strict logical reasoning.
This paper proposes a neurosymbolic approach that augments LLMs with logic-based reasoning modules.
arXiv Detail & Related papers (2025-06-29T22:03:01Z)
- Planning without Search: Refining Frontier LLMs with Offline Goal-Conditioned RL [62.984693936073974]
Large language models (LLMs) excel in tasks like question answering and dialogue.
Complex interactive tasks, such as negotiation and persuasion, however, demand additional long-horizon reasoning and planning.
We propose a novel approach that uses goal-conditioned value functions to guide the reasoning of LLM agents.
arXiv Detail & Related papers (2025-05-23T16:51:54Z)
- Interactive and Expressive Code-Augmented Planning with Large Language Models [62.799579304821826]
Large Language Models (LLMs) demonstrate strong abilities in common-sense reasoning and interactive decision-making.
Recent techniques have sought to structure LLM outputs using control flow and other code-adjacent techniques to improve planning performance.
We propose REPL-Plan, an LLM planning approach that is fully code-expressive and dynamic.
arXiv Detail & Related papers (2024-11-21T04:23:17Z)
- Aligning with Logic: Measuring, Evaluating and Improving Logical Preference Consistency in Large Language Models [31.558429029429863]
Large Language Models (LLMs) are expected to be predictable and trustworthy to support reliable decision-making systems.
This work examines logical preference consistency as a foundational requirement for building more dependable LLM systems.
We show that improving consistency leads to better performance in LLM-driven logic-based algorithms.
arXiv Detail & Related papers (2024-10-03T04:34:04Z)
- Directed Exploration in Reinforcement Learning from Linear Temporal Logic [59.707408697394534]
Linear temporal logic (LTL) is a powerful language for task specification in reinforcement learning.
We show that the synthesized reward signal remains fundamentally sparse, making exploration challenging.
We show how better exploration can be achieved by further leveraging the specification and casting its corresponding Limit Deterministic Büchi Automaton (LDBA) as a Markov reward process.
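As a rough illustration of casting an automaton as a Markov reward process, the sketch below runs value iteration over a toy automaton's states under an assumed uniform-chance transition model; both the construction and the example automaton are hypothetical simplifications, not the paper's exact formulation:

```python
# Hedged sketch: treat a toy automaton as a Markov reward process and
# rank its states by proximity to acceptance. The uniform-transition
# assumption and the example automaton are illustrative only.
from typing import Dict, List, Set

def automaton_state_values(transitions: Dict[str, List[str]],
                           accepting: Set[str],
                           gamma: float = 0.9,
                           iters: int = 100) -> Dict[str, float]:
    """Value iteration over automaton states, rewarding visits to
    accepting states; higher values indicate progress toward satisfying
    the LTL specification and can serve as a dense exploration signal."""
    values = {q: 0.0 for q in transitions}
    for _ in range(iters):
        values = {
            q: (1.0 if q in accepting else 0.0)
               + gamma * sum(values[q2] for q2 in nexts) / len(nexts)
            for q, nexts in transitions.items()
        }
    return values

# Toy automaton roughly matching "eventually a, then eventually b":
ldba = {"q0": ["q0", "q1"], "q1": ["q1", "q2"], "q2": ["q2"]}
print(automaton_state_values(ldba, accepting={"q2"}))
```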
arXiv Detail & Related papers (2024-08-18T14:25:44Z)
- Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning [53.6472920229013]
Large Language Models (LLMs) have demonstrated impressive capability in many natural language tasks.
LLMs are prone to produce errors, hallucinations and inconsistent statements when performing multi-step reasoning.
We introduce Q*, a framework for guiding the decoding process of LLMs with deliberative planning.
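One plausible reading of deliberative planning over decoding is best-first search on partial reasoning traces scored by a learned heuristic; the sketch below illustrates that shape, with `expand` and `q_value` as hypothetical stand-ins rather than Q*'s published interface:

```python
# Hedged sketch of deliberative decoding as best-first search over
# partial reasoning traces; `expand` and `q_value` are hypothetical.
import heapq
from typing import Callable, List, Tuple

def deliberative_decode(prompt: str,
                        expand: Callable[[str], List[str]],  # next-step candidates from the LLM
                        q_value: Callable[[str], float],     # learned estimate of trace quality
                        is_complete: Callable[[str], bool],
                        max_expansions: int = 100) -> str:
    """Always extend the partial trace the heuristic scores highest,
    instead of committing greedily to one step at a time."""
    frontier: List[Tuple[float, str]] = [(-q_value(prompt), prompt)]
    for _ in range(max_expansions):
        if not frontier:
            break
        _, trace = heapq.heappop(frontier)
        if is_complete(trace):
            return trace
        for step in expand(trace):
            candidate = trace + "\n" + step
            heapq.heappush(frontier, (-q_value(candidate), candidate))
    raise RuntimeError("search budget exhausted without a complete trace")
```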
arXiv Detail & Related papers (2024-06-20T13:08:09Z)
- From Words to Actions: Unveiling the Theoretical Underpinnings of LLM-Driven Autonomous Systems [59.40480894948944]
Large language model (LLM) empowered agents are able to solve decision-making problems in the physical world.
Under this model, the LLM Planner navigates a partially observable Markov decision process (POMDP) by iteratively generating language-based subgoals via prompting.
We prove that the pretrained LLM Planner effectively performs Bayesian aggregated imitation learning (BAIL) through in-context learning.
arXiv Detail & Related papers (2024-05-30T09:42:54Z)
- Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration [70.09561665520043]
We propose a novel framework for multi-agent collaboration that introduces Reinforced Advantage feedback (ReAd) for efficient self-refinement of plans.
We provide theoretical analysis by extending advantage-weighted regression in reinforcement learning to multi-agent systems.
Experiments on Overcooked-AI and a difficult variant of RoCoBench show that ReAd surpasses baselines in success rate and significantly decreases the interaction steps of agents.
arXiv Detail & Related papers (2024-05-23T08:33:19Z)
- Dynamic Planning with a LLM [15.430182858130884]
Large Language Models (LLMs) can solve many NLP tasks in zero-shot settings, but applications involving embodied agents remain problematic.
Our work presents LLM Dynamic Planner (LLM-DP), a neuro-symbolic framework where an LLM works hand-in-hand with a traditional planner to solve an embodied task.
arXiv Detail & Related papers (2023-08-11T21:17:13Z)