PolicyEvol-Agent: Evolving Policy via Environment Perception and Self-Awareness with Theory of Mind
- URL: http://arxiv.org/abs/2504.15313v1
- Date: Sun, 20 Apr 2025 06:43:23 GMT
- Title: PolicyEvol-Agent: Evolving Policy via Environment Perception and Self-Awareness with Theory of Mind
- Authors: Yajie Yu, Yue Feng
- Abstract summary: PolicyEvol-Agent is a comprehensive framework characterized by systematically acquiring intentions of others. PolicyEvol-Agent integrates a range of cognitive operations with Theory of Mind alongside internal and external perspectives.
- Score: 9.587070290189507
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-agent systems have exhibited significant intelligence in real-world simulations with large language models (LLMs), owing to their capabilities in social cognition and knowledge retrieval. However, existing research on agents equipped with effective cognition chains, including reasoning, planning, decision-making, and reflecting, remains limited, especially in dynamically interactive scenarios. In addition, unlike humans, prompt-based responses face challenges in psychological state perception and empirical calibration during uncertain gaming processes, which can inevitably lead to cognitive bias. In light of the above, we introduce PolicyEvol-Agent, a comprehensive LLM-empowered framework characterized by systematically acquiring the intentions of others and adaptively optimizing irrational strategies for continual enhancement. Specifically, PolicyEvol-Agent first obtains reflective expertise patterns and then integrates a range of cognitive operations with Theory of Mind alongside internal and external perspectives. Simulation results, in which it outperforms RL-based models and agent-based methods, demonstrate the superiority of PolicyEvol-Agent in achieving final game victory. Moreover, the policy evolution mechanism reveals the effectiveness of dynamic guideline adjustments in both automatic and human evaluation.
Related papers
- Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems [133.45145180645537]
The advent of large language models (LLMs) has catalyzed a transformative shift in artificial intelligence. As these agents increasingly drive AI research and practical applications, their design, evaluation, and continuous improvement present intricate, multifaceted challenges. This survey provides a comprehensive overview, framing intelligent agents within a modular, brain-inspired architecture.
arXiv Detail & Related papers (2025-03-31T18:00:29Z)
- Build An Influential Bot In Social Media Simulations With Large Language Models [7.242974711907219]
This study introduces a novel simulated environment that combines Agent-Based Modeling (ABM) with Large Language Models (LLMs).
We present an innovative application of Reinforcement Learning (RL) to replicate the process of opinion leader formation.
Our findings reveal that limiting the action space and incorporating self-observation are key factors for achieving stable opinion leader generation.
arXiv Detail & Related papers (2024-11-29T11:37:12Z)
- Metacognition for Unknown Situations and Environments (MUSE) [3.2020845462590697]
We propose the Metacognition for Unknown Situations and Environments (MUSE) framework.
MUSE integrates metacognitive processes--specifically self-awareness and self-regulation--into autonomous agents.
Agents show significant improvements in self-awareness and self-regulation.
arXiv Detail & Related papers (2024-11-20T18:41:03Z)
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process. We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
- Appraisal-Guided Proximal Policy Optimization: Modeling Psychological Disorders in Dynamic Grid World [0.0]
We develop a methodology for modeling psychological disorders using Reinforcement Learning (RL) agents.
We investigated numerous reward-shaping strategies to simulate psychological disorders and regulate the behavior of the agents.
A comparison of various configurations of the modified PPO algorithm identified variants that simulate Anxiety disorder and Obsessive-Compulsive Disorder (OCD)-like behavior in agents.
arXiv Detail & Related papers (2024-07-29T19:19:54Z)
- PersLLM: A Personified Training Approach for Large Language Models [66.16513246245401]
We propose PersLLM, integrating psychology-grounded principles of personality: social practice, consistency, and dynamic development.
We incorporate personality traits directly into the model parameters, enhancing the model's resistance to induction, promoting consistency, and supporting the dynamic evolution of personality.
arXiv Detail & Related papers (2024-07-17T08:13:22Z)
- LLM as a Mastermind: A Survey of Strategic Reasoning with Large Language Models [75.89014602596673]
Strategic reasoning requires understanding and predicting adversary actions in multi-agent settings while adjusting strategies accordingly.
We explore the scopes, applications, methodologies, and evaluation metrics related to strategic reasoning with Large Language Models.
It underscores the importance of strategic reasoning as a critical cognitive capability and offers insights into future research directions and potential improvements.
arXiv Detail & Related papers (2024-04-01T16:50:54Z)
- Agent Alignment in Evolving Social Norms [65.45423591744434]
We propose an evolutionary framework for agent evolution and alignment, named EvolutionaryAgent.
In an environment where social norms continuously evolve, agents better adapted to the current social norms will have a higher probability of survival and proliferation.
We show that EvolutionaryAgent can align progressively better with the evolving social norms while maintaining its proficiency in general tasks.
arXiv Detail & Related papers (2024-01-09T15:44:44Z)
- Reflexion: Language Agents with Verbal Reinforcement Learning [44.85337947858337]
Reflexion is a novel framework to reinforce language agents not by updating weights, but through linguistic feedback.
It is flexible enough to incorporate various types (scalar values or free-form language) and sources (external or internally simulated) of feedback signals.
For example, Reflexion achieves a 91% pass@1 accuracy on the HumanEval coding benchmark, surpassing the previous state-of-the-art GPT-4 that achieves 80%.
arXiv Detail & Related papers (2023-03-20T18:08:50Z)
- Learning Complex Spatial Behaviours in ABM: An Experimental Observational Study [0.0]
This paper explores how Reinforcement Learning can be applied to create emergent agent behaviours.
Running a series of simulations, we demonstrate that agents trained using the novel Proximal Policy Optimisation algorithm behave in ways that exhibit properties of real-world intelligent adaptive behaviours.
arXiv Detail & Related papers (2022-01-04T11:56:11Z)
- Backprop-Free Reinforcement Learning with Active Neural Generative Coding [84.11376568625353]
We propose a computational framework for learning action-driven generative models without backpropagation of errors (backprop) in dynamic environments.
We develop an intelligent agent that operates even with sparse rewards, drawing inspiration from the cognitive theory of planning as inference.
The robust performance of our agent offers promising evidence that a backprop-free approach for neural inference and learning can drive goal-directed behavior.
arXiv Detail & Related papers (2021-07-10T19:02:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.