Can LLMs Fix Issues with Reasoning Models? Towards More Likely Models
for AI Planning
- URL: http://arxiv.org/abs/2311.13720v2
- Date: Tue, 5 Mar 2024 00:19:24 GMT
- Title: Can LLMs Fix Issues with Reasoning Models? Towards More Likely Models
for AI Planning
- Authors: Turgay Caglar, Sirine Belhaj, Tathagata Chakraborti, Michael Katz,
Sarath Sreedharan
- Abstract summary: This is the first work to look at the application of large language models (LLMs) for the purpose of model space edits in automated planning tasks.
We empirically demonstrate how the performance of an LLM contrasts with combinatorial search (CS).
Our experiments show promising results suggesting further forays into the exciting world of model space reasoning for planning tasks in the future.
- Score: 26.239075588286127
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This is the first work to look at the application of large language models
(LLMs) for the purpose of model space edits in automated planning tasks. To set
the stage for this union, we explore two different flavors of model space
problems that have been studied in the AI planning literature and explore the
effect of an LLM on those tasks. We empirically demonstrate how the performance
of an LLM contrasts with combinatorial search (CS) -- an approach that has been
traditionally used to solve model space tasks in planning, both with the LLM in
the role of a standalone model space reasoner as well as in the role of a
statistical signal in concert with the CS approach as part of a two-stage
process. Our experiments show promising results suggesting further forays of
LLMs into the exciting world of model space reasoning for planning tasks in the
future.
Related papers
- Embodied AI in Mobile Robots: Coverage Path Planning with Large Language Models [6.860460230412773]
We propose an LLM-embodied path planning framework for mobile agents.
Our proposed multi-layer architecture uses prompted LLMs in the path planning phase and integrates them with the mobile agents' low-level actuators.
Our experiments show that this framework can improve LLMs' 2D plane reasoning abilities and complete coverage path planning tasks.
arXiv Detail & Related papers (2024-07-02T12:38:46Z)
- VSP: Assessing the dual challenges of perception and reasoning in spatial planning tasks for VLMs [102.36953558562436]
Vision language models (VLMs) are an exciting emerging class of language models (LMs).
One understudied capability in VLMs is visual spatial planning.
Our study introduces a benchmark that evaluates the spatial planning capability in these models in general.
arXiv Detail & Related papers (2024-07-02T00:24:01Z)
- Exploring and Benchmarking the Planning Capabilities of Large Language Models [57.23454975238014]
This work lays the foundations for improving planning capabilities of large language models (LLMs).
We construct a comprehensive benchmark suite encompassing both classical planning benchmarks and natural language scenarios.
We investigate the use of many-shot in-context learning to enhance LLM planning, exploring the relationship between increased context length and improved planning performance.
arXiv Detail & Related papers (2024-06-18T22:57:06Z)
- From Words to Actions: Unveiling the Theoretical Underpinnings of LLM-Driven Autonomous Systems [59.40480894948944]
Large language model (LLM) empowered agents are able to solve decision-making problems in the physical world.
Under this model, the LLM Planner navigates a partially observable Markov decision process (POMDP) by iteratively generating language-based subgoals via prompting.
We prove that the pretrained LLM Planner effectively performs Bayesian aggregated imitation learning (BAIL) through in-context learning.
arXiv Detail & Related papers (2024-05-30T09:42:54Z)
- Towards Modeling Learner Performance with Large Language Models [7.002923425715133]
This paper investigates whether the pattern recognition and sequence modeling capabilities of LLMs can be extended to the domain of knowledge tracing.
We compare two approaches to using LLMs for this task, zero-shot prompting and model fine-tuning, with existing, non-LLM approaches to knowledge tracing.
While LLM-based approaches do not achieve state-of-the-art performance, fine-tuned LLMs surpass the performance of naive baseline models and perform on par with standard Bayesian Knowledge Tracing approaches.
arXiv Detail & Related papers (2024-02-29T14:06:34Z)
- Understanding the planning of LLM agents: A survey [98.82513390811148]
This survey provides the first systematic view of LLM-based agent planning, covering recent works aiming to improve planning ability.
Comprehensive analyses are conducted for each direction, and further challenges in the field of research are discussed.
arXiv Detail & Related papers (2024-02-05T04:25:24Z)
- Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion [86.6191592951269]
Merging models that are fine-tuned from a common, extensively pretrained large model but specialized for different tasks has been demonstrated to be a cheap and scalable strategy for constructing a multitask model that performs well across diverse tasks.
We propose the CONtinuous relaxation of disCRETE (Concrete) subspace learning method to identify a common low-dimensional subspace and utilize its shared information to track the interference problem without sacrificing performance.
arXiv Detail & Related papers (2023-12-11T07:24:54Z)
- GPT-Based Models Meet Simulation: How to Efficiently Use Large-Scale Pre-Trained Language Models Across Simulation Tasks [0.0]
This paper is the first examination regarding the use of large-scale pre-trained language models for scientific simulations.
The first task is devoted to explaining the structure of a conceptual model to promote the engagement of participants.
The second task focuses on summarizing simulation outputs, so that model users can identify a preferred scenario.
The third task seeks to broaden accessibility to simulation platforms by conveying the insights of simulation visualizations via text.
arXiv Detail & Related papers (2023-06-21T15:42:36Z)
- Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
arXiv Detail & Related papers (2020-07-14T16:42:59Z)