PreAct: Prediction Enhances Agent's Planning Ability
- URL: http://arxiv.org/abs/2402.11534v2
- Date: Thu, 05 Dec 2024 04:40:54 GMT
- Title: PreAct: Prediction Enhances Agent's Planning Ability
- Authors: Dayuan Fu, Jianzhao Huang, Siyuan Lu, Guanting Dong, Yejie Wang, Keqing He, Weiran Xu
- Abstract summary: We present **PreAct**, an agent framework that integrates **pre**diction, **rea**soning, and **act**ion. By utilizing the information derived from predictions, the large language model (LLM) agent can provide a wider range and more strategically focused reasoning.
- Score: 23.058048254571027
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Addressing the disparity between forecasts and actual results can enable individuals to expand their thought processes and stimulate self-reflection, thus promoting accurate planning. In this research, we present **PreAct**, an agent framework that integrates **pre**diction, **rea**soning, and **act**ion. By utilizing the information derived from predictions, the large language model (LLM) agent can provide a wider range and more strategically focused reasoning. This leads to more efficient actions that aid the agent in accomplishing intricate tasks. Our experimental results show that PreAct surpasses the ReAct method in completing complex tasks and that PreAct's performance can be further improved when paired with other memory or selection strategy techniques. We presented the model with varying quantities of historical predictions and discovered that these predictions consistently enhance LLM planning. The variances in single-step reasoning between PreAct and ReAct indicate that PreAct indeed has benefits in terms of diversity and strategic orientation over ReAct.
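
As a rough illustration of the loop the abstract describes, the Python sketch below has the agent emit a prediction alongside its thought and action at each step, then exposes the most recent predictions and the actual observations to the next step. It is a minimal sketch under stated assumptions, not the authors' implementation: `Step`, `run_episode`, `history_size`, the policy and environment signatures, and the toy task are all hypothetical names introduced here, and the policy is a hard-coded stand-in for an LLM call.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Step:
    prediction: str   # what the agent expected to observe
    thought: str      # reasoning text
    action: str       # action sent to the environment
    observation: str  # what the environment actually returned


# Hypothetical signatures (assumptions, not taken from the paper):
# policy: (task, visible history) -> (prediction, thought, action)
Policy = Callable[[str, List[Step]], Tuple[str, str, str]]
# env: action -> (observation, done)
Env = Callable[[str], Tuple[str, bool]]


def run_episode(task: str, policy: Policy, env: Env,
                history_size: int = 2, max_steps: int = 10) -> List[Step]:
    """Predict-reason-act loop; the policy sees the last `history_size` steps."""
    trajectory: List[Step] = []
    for _ in range(max_steps):
        # Guard against history_size == 0, since list[-0:] is the whole list.
        visible = trajectory[-history_size:] if history_size > 0 else []
        prediction, thought, action = policy(task, visible)
        observation, done = env(action)
        trajectory.append(Step(prediction, thought, action, observation))
        if done:
            break
    return trajectory


def toy_policy(task: str, visible: List[Step]) -> Tuple[str, str, str]:
    # Stand-in for an LLM call: change strategy when the last prediction missed.
    if visible and visible[-1].prediction != visible[-1].observation:
        return ("door opens", "The drawer was empty, so try the door.", "open door")
    return ("key found", "Look for a key first.", "open drawer")


def toy_env(action: str) -> Tuple[str, bool]:
    if action == "open door":
        return ("door opens", True)
    return ("drawer is empty", False)


trajectory = run_episode("leave the room", toy_policy, toy_env)
for step in trajectory:
    print(f"{step.action}: predicted '{step.prediction}', observed '{step.observation}'")
```

In this toy run the mismatch between the first prediction and the first observation is what triggers the change of strategy. Setting `history_size=0` hides past predictions from the policy, which gives a crude way to compare behaviour with and without prediction history.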
Related papers
- Latent Diffusion Planning for Imitation Learning [78.56207566743154]
Latent Diffusion Planning (LDP) is a modular approach consisting of a planner and inverse dynamics model.
By separating planning from action prediction, LDP can benefit from the denser supervision signals of suboptimal and action-free data.
On simulated visual robotic manipulation tasks, LDP outperforms state-of-the-art imitation learning approaches.
arXiv Detail & Related papers (2025-04-23T17:53:34Z)
- Interpreting Emergent Planning in Model-Free Reinforcement Learning [13.820891288919002]
We present the first evidence that model-free reinforcement learning agents can learn to plan.
This is achieved by applying a methodology based on concept-based interpretability to a model-free agent in Sokoban.
arXiv Detail & Related papers (2025-04-02T16:24:23Z)
- Microfoundation Inference for Strategic Prediction [26.277259491014163]
We propose a methodology for learning the distribution map that encapsulates the long-term impacts of predictive models on the population.
Specifically, we model agents' responses as a cost-utility problem and propose estimates for said cost.
We provide a rate of convergence for this proposed estimate and assess its quality through empirical demonstrations on a credit-scoring dataset.
arXiv Detail & Related papers (2024-11-13T19:37:49Z)
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
- Predicting Future Actions of Reinforcement Learning Agents [27.6973598477153]
This paper experimentally evaluates and compares the effectiveness of future action and event prediction for three types of reinforcement learning agents.
We employ two approaches: the inner state approach, which involves predicting based on the inner computations of the agents, and a simulation-based approach, which involves unrolling the agent in a learned world model.
Using internal plans proves more robust to model quality compared to simulation-based approaches when predicting actions, while the results for event prediction are more mixed.
arXiv Detail & Related papers (2024-10-29T18:48:18Z)
- CoPS: Empowering LLM Agents with Provable Cross-Task Experience Sharing [70.25689961697523]
We propose a generalizable algorithm that enhances sequential reasoning by cross-task experience sharing and selection.
Our work bridges the gap between existing sequential reasoning paradigms and validates the effectiveness of leveraging cross-task experiences.
arXiv Detail & Related papers (2024-10-22T03:59:53Z)
- Performative Prediction on Games and Mechanism Design [69.7933059664256]
We study a collective risk dilemma where agents decide whether to trust predictions based on past accuracy.
As predictions shape collective outcomes, social welfare arises naturally as a metric of concern.
We show how to achieve better trade-offs and use them for mechanism design.
arXiv Detail & Related papers (2024-08-09T16:03:44Z)
- From Recognition to Prediction: Leveraging Sequence Reasoning for Action Anticipation [30.161471749050833]
We propose a novel end-to-end video modeling architecture that utilizes attention mechanisms, named Anticipation via Recognition and Reasoning (ARR).
ARR decomposes the action anticipation task into action recognition and reasoning tasks, and effectively learns the statistical relationship between actions by next action prediction (NAP).
In addition, to address the challenge of relationship modeling that requires extensive training data, we propose an innovative approach for the unsupervised pre-training of the decoder.
arXiv Detail & Related papers (2024-08-05T18:38:29Z)
- CAMMARL: Conformal Action Modeling in Multi Agent Reinforcement Learning [5.865719902445064]
We propose a novel multi-agent reinforcement learning algorithm CAMMARL.
It involves modeling the actions of other agents in different situations in the form of confident sets.
We show that CAMMARL elevates the capabilities of an autonomous agent in MARL by modeling conformal prediction sets.
arXiv Detail & Related papers (2023-06-19T19:03:53Z)
- NashFormer: Leveraging Local Nash Equilibria for Semantically Diverse Trajectory Prediction [11.319057000888638]
NashFormer is a framework for trajectory prediction that leverages game-theoretic inverse reinforcement learning to improve coverage of multi-modal predictions.
Experiment results show that our predictor produces accurate predictions while covering 33% more potential interactions versus a baseline model.
arXiv Detail & Related papers (2023-05-28T00:41:29Z)
- Prediction-Oriented Bayesian Active Learning [51.426960808684655]
Expected predictive information gain (EPIG) is an acquisition function that measures information gain in the space of predictions rather than parameters.
EPIG leads to stronger predictive performance compared with BALD across a range of datasets and models.
arXiv Detail & Related papers (2023-04-17T10:59:57Z)
- Can ChatGPT Forecast Stock Price Movements? Return Predictability and Large Language Models [51.3422222472898]
We document the capability of large language models (LLMs) like ChatGPT to predict stock price movements using news headlines.
We develop a theoretical model incorporating information capacity constraints, underreaction, limits-to-arbitrage, and LLMs.
arXiv Detail & Related papers (2023-04-15T19:22:37Z)
- What Should I Know? Using Meta-gradient Descent for Predictive Feature Discovery in a Single Stream of Experience [63.75363908696257]
Computational reinforcement learning seeks to construct an agent's perception of the world through predictions of future sensations.
An open challenge in this line of work is determining from the infinitely many predictions that the agent could possibly make which predictions might best support decision-making.
We introduce a meta-gradient descent process by which an agent learns 1) what predictions to make, 2) the estimates for its chosen predictions, and 3) how to use those estimates to generate policies that maximize future reward.
arXiv Detail & Related papers (2022-06-13T21:31:06Z)
- A Word is Worth A Thousand Dollars: Adversarial Attack on Tweets Fools Stock Prediction [100.9772316028191]
In this paper, we experiment with a variety of adversarial attack configurations to fool three stock prediction victim models.
Our results show that the proposed attack method can achieve consistent success rates and cause significant monetary loss in trading simulation.
arXiv Detail & Related papers (2022-05-01T05:12:22Z)
- Finding Useful Predictions by Meta-gradient Descent to Improve Decision-making [1.384055225262046]
We focus on predictions expressed as General Value Functions: temporally extended estimates of the accumulation of a future signal.
One challenge is determining from the infinitely many predictions that the agent could possibly make which might support decision-making.
By learning, rather than manually specifying these predictions, we enable the agent to identify useful predictions in a self-supervised manner.
arXiv Detail & Related papers (2021-11-18T20:17:07Z)
- The Importance of Prior Knowledge in Precise Multimodal Prediction [71.74884391209955]
Roads have well defined geometries, topologies, and traffic rules.
In this paper we propose to incorporate structured priors as a loss function.
We demonstrate the effectiveness of our approach on real-world self-driving datasets.
arXiv Detail & Related papers (2020-06-04T03:56:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.