Related papers: Towards Generalizable Agents in Text-Based Educational Environments: A Study of Integrating RL with LLMs

Towards Generalizable Agents in Text-Based Educational Environments: A Study of Integrating RL with LLMs

URL: http://arxiv.org/abs/2404.18978v1
Date: Mon, 29 Apr 2024 14:53:48 GMT
Title: Towards Generalizable Agents in Text-Based Educational Environments: A Study of Integrating RL with LLMs
Authors: Bahar Radmehr, Adish Singla, Tanja Käser,
Abstract summary: We aim to enhance the generalization capabilities of agents in open-ended text-based learning environments by integrating Reinforcement Learning (RL) with Large Language Models (LLMs) We introduce PharmaSimText, a novel benchmark derived from the PharmaSim virtual pharmacy environment designed for practicing diagnostic conversations. Our results show that RL-based agents excel in task completion but lack in asking quality diagnostic questions.
Score: 22.568925103893182
License: http://creativecommons.org/licenses/by/4.0/
Abstract: There has been a growing interest in developing learner models to enhance learning and teaching experiences in educational environments. However, existing works have primarily focused on structured environments relying on meticulously crafted representations of tasks, thereby limiting the agent's ability to generalize skills across tasks. In this paper, we aim to enhance the generalization capabilities of agents in open-ended text-based learning environments by integrating Reinforcement Learning (RL) with Large Language Models (LLMs). We investigate three types of agents: (i) RL-based agents that utilize natural language for state and action representations to find the best interaction strategy, (ii) LLM-based agents that leverage the model's general knowledge and reasoning through prompting, and (iii) hybrid LLM-assisted RL agents that combine these two strategies to improve agents' performance and generalization. To support the development and evaluation of these agents, we introduce PharmaSimText, a novel benchmark derived from the PharmaSim virtual pharmacy environment designed for practicing diagnostic conversations. Our results show that RL-based agents excel in task completion but lack in asking quality diagnostic questions. In contrast, LLM-based agents perform better in asking diagnostic questions but fall short of completing the task. Finally, hybrid LLM-assisted RL agents enable us to overcome these limitations, highlighting the potential of combining RL and LLMs to develop high-performing agents for open-ended learning environments.

Related papers

Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning [45.88626187315028]
Large Language Models (LLMs) are increasingly being explored for building Agents capable of active environmental interaction (e.g., via tool use) to solve complex problems.<n>This paper first revisits and clarifies Reinforcement Learning methodologies for LLM Agents by systematically extending the Markov Decision Process (MDP) framework.<n> Secondly, we introduce Agent-R1, a modular, flexible, and user-friendly training framework for RL-based LLM Agents, designed for straightforward adaptation across diverse task scenarios and interactive environments.
arXiv Detail & Related papers (2025-11-18T13:03:15Z)
A Survey on Agentic Multimodal Large Language Models [84.18778056010629]
We present a comprehensive survey on Agentic Multimodal Large Language Models (Agentic MLLMs)<n>We explore the emerging paradigm of agentic MLLMs, delineating their conceptual foundations and distinguishing characteristics from conventional MLLM-based agents.<n>To further accelerate research in this area for the community, we compile open-source training frameworks, training and evaluation datasets for developing agentic MLLMs.
arXiv Detail & Related papers (2025-10-13T04:07:01Z)
LLM-based Agents Suffer from Hallucinations: A Survey of Taxonomy, Methods, and Directions [80.12078194093013]
We present the first comprehensive survey of hallucinations in LLM-based agents.<n>We propose a new taxonomy that identifies different types of agent hallucinations occurring at different stages.<n>We conduct an in-depth examination of eighteen triggering causes underlying the emergence of agent hallucinations.
arXiv Detail & Related papers (2025-09-23T13:24:48Z)
AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning [129.44038804430542]
We introduce AgentGym-RL, a new framework to train LLM agents for multi-turn interactive decision-making through RL.<n>We propose ScalingInter-RL, a training approach designed for exploration-exploitation balance and stable RL optimization.<n>Our agents match or surpass commercial models on 27 tasks across diverse environments.
arXiv Detail & Related papers (2025-09-10T16:46:11Z)
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey [103.32591749156416]
The emergence of agentic reinforcement learning (Agentic RL) marks a paradigm shift from conventional reinforcement learning applied to large language models (LLM RL)<n>This survey formalizes this conceptual shift by contrasting the degenerate single-step Markov Decision Processes (MDPs) of LLM-RL with the temporally extended, partially observable Markov decision processes (POMDPs) that define Agentic RL.
arXiv Detail & Related papers (2025-09-02T17:46:26Z)
An Empirical Study on Reinforcement Learning for Reasoning-Search Interleaved LLM Agents [34.25887147052966]
Reinforcement learning (RL) has demonstrated strong potential in training large language models (LLMs) capable of complex reasoning for real-world problem solving.<n>More recently, RL has been leveraged to create sophisticated LLM-based search agents that adeptly combine reasoning with search engine use.<n>Key factors -- such as (1) reward formulation, (2) the choice and characteristics of the underlying LLM, and (3) the role of the search engine in the RL process -- require further investigation.
arXiv Detail & Related papers (2025-05-21T05:09:43Z)
The Influence of Human-inspired Agentic Sophistication in LLM-driven Strategic Reasoners [3.5083201638203154]
We evaluate the role of agentic sophistication in shaping artificial reasoners' performance.<n>We benchmarked three agent designs: a simple game-theoretic model, an unstructured LLM-as-agent model, and an LLM integrated into a traditional agentic framework.<n>Our analysis, covering over 2000 reasoning samples across 25 agent configurations, shows that human-inspired cognitive structures can enhance LLM agents' alignment with human strategic behaviour.
arXiv Detail & Related papers (2025-05-14T13:51:24Z)
LLM Agents for Education: Advances and Applications [49.3663528354802]
Large Language Model (LLM) agents have demonstrated remarkable capabilities in automating tasks and driving innovation across diverse educational applications. This survey aims to provide a comprehensive technological overview of LLM agents for education, fostering further research and collaboration to enhance their impact for the greater good of learners and educators alike.
arXiv Detail & Related papers (2025-03-14T11:53:44Z)
ReMA: Learning to Meta-think for LLMs with Multi-Agent Reinforcement Learning [54.787341008881036]
We introduce Reinforced Meta-thinking Agents (ReMA), a novel framework that leverages Multi-Agent Reinforcement Learning (MARL) to elicit meta-thinking behaviors. ReMA decouples the reasoning process into two hierarchical agents: a high-level meta-thinking agent responsible for generating strategic oversight and plans, and a low-level reasoning agent for detailed executions. Experimental results demonstrate that ReMA outperforms single-agent RL baselines on complex reasoning tasks.
arXiv Detail & Related papers (2025-03-12T16:05:31Z)
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning [87.30285670315334]
textbfR1-Searcher is a novel two-stage outcome-based RL approach designed to enhance the search capabilities of Large Language Models. Our framework relies exclusively on RL, without requiring process rewards or distillation for a cold start. Our experiments demonstrate that our method significantly outperforms previous strong RAG methods, even when compared to the closed-source GPT-4o-mini.
arXiv Detail & Related papers (2025-03-07T17:14:44Z)
Mental Modeling of Reinforcement Learning Agents by Language Models [14.668006477454616]
This study empirically examines, for the first time, how well large language models can build a mental model of agents. This research may unveil the potential of leveraging LLMs for elucidating RL agent behaviour.
arXiv Detail & Related papers (2024-06-26T17:14:45Z)
EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents [65.38474102119181]
We propose EnvGen, a framework to adaptively create training environments. We train a small RL agent in a mixture of the original and LLM-generated environments. We find that a small RL agent trained with EnvGen can outperform SOTA methods, including a GPT-4 agent, and learns long-horizon tasks significantly faster.
arXiv Detail & Related papers (2024-03-18T17:51:16Z)
How Can LLM Guide RL? A Value-Based Approach [68.55316627400683]
Reinforcement learning (RL) has become the de facto standard practice for sequential decision-making problems by improving future acting policies with feedback. Recent developments in large language models (LLMs) have showcased impressive capabilities in language understanding and generation, yet they fall short in exploration and self-improvement capabilities. We develop an algorithm named LINVIT that incorporates LLM guidance as a regularization factor in value-based RL, leading to significant reductions in the amount of data needed for learning.
arXiv Detail & Related papers (2024-02-25T20:07:13Z)
AgentLite: A Lightweight Library for Building and Advancing Task-Oriented LLM Agent System [91.41155892086252]
We open-source a new AI agent library, AgentLite, which simplifies research investigation into LLM agents. AgentLite is a task-oriented framework designed to enhance the ability of agents to break down tasks. We introduce multiple practical applications developed with AgentLite to demonstrate its convenience and flexibility.
arXiv Detail & Related papers (2024-02-23T06:25:20Z)
Large Language Model-based Human-Agent Collaboration for Complex Task Solving [94.3914058341565]
We introduce the problem of Large Language Models (LLMs)-based human-agent collaboration for complex task-solving. We propose a Reinforcement Learning-based Human-Agent Collaboration method, ReHAC. This approach includes a policy model designed to determine the most opportune stages for human intervention within the task-solving process.
arXiv Detail & Related papers (2024-02-20T11:03:36Z)
Mutual Enhancement of Large Language and Reinforcement Learning Models through Bi-Directional Feedback Mechanisms: A Case Study [1.3597551064547502]
We employ a teacher-student learning framework to tackle problems of Large Language Models (LLMs) and reinforcement learning (RL) models. Within this framework, the LLM acts as a teacher, while the RL model acts as a student. We propose a practical algorithm to address the problem and conduct empirical experiments to evaluate the effectiveness of our method.
arXiv Detail & Related papers (2024-01-12T14:35:57Z)
Exploring Large Language Model based Intelligent Agents: Definitions, Methods, and Prospects [32.91556128291915]
This paper surveys current research to provide an in-depth overview of intelligent agents within single and multi-agent systems. It covers their definitions, research frameworks, and foundational components such as their composition, cognitive and planning methods, tool utilization, and responses to environmental feedback. We conclude by envisioning prospects for LLM-based agents, considering the evolving landscape of AI and natural language processing.
arXiv Detail & Related papers (2024-01-07T09:08:24Z)
On Diagnostics for Understanding Agent Training Behaviour in Cooperative MARL [5.124364759305485]
We argue that relying solely on the empirical returns may obscure crucial insights into agent behaviour. In this paper, we explore the application of explainable AI (XAI) tools to gain profound insights into agent behaviour.
arXiv Detail & Related papers (2023-12-13T19:10:10Z)
Theory of Mind for Multi-Agent Collaboration via Large Language Models [5.2767999863286645]
This study evaluates Large Language Models (LLMs)-based agents in a multi-agent cooperative text game with Theory of Mind (ToM) inference tasks. We observed evidence of emergent collaborative behaviors and high-order Theory of Mind capabilities among LLM-based agents.
arXiv Detail & Related papers (2023-10-16T07:51:19Z)
ExpeL: LLM Agents Are Experiential Learners [60.54312035818746]
We introduce the Experiential Learning (ExpeL) agent to allow learning from agent experiences without requiring parametric updates. Our agent autonomously gathers experiences and extracts knowledge using natural language from a collection of training tasks. At inference, the agent recalls its extracted insights and past experiences to make informed decisions.
arXiv Detail & Related papers (2023-08-20T03:03:34Z)
AgentBench: Evaluating LLMs as Agents [88.45506148281379]
Large Language Models (LLMs) are becoming increasingly smart and autonomous, targeting real-world pragmatic missions beyond traditional NLP tasks. We present AgentBench, a benchmark that currently consists of 8 distinct environments to assess LLM-as-Agent's reasoning and decision-making abilities.
arXiv Detail & Related papers (2023-08-07T16:08:11Z)

This list is automatically generated from the titles and abstracts of the papers in this site.