Requesting Expert Reasoning: Augmenting LLM Agents with Learned Collaborative Intervention
- URL: http://arxiv.org/abs/2602.22546v1
- Date: Thu, 26 Feb 2026 02:38:25 GMT
- Title: Requesting Expert Reasoning: Augmenting LLM Agents with Learned Collaborative Intervention
- Authors: Zhiming Wang, Jinwei He, Feng Lu
- Abstract summary: AHCE (Active Human-Augmented Challenge Engagement) is a framework for on-demand Human-AI collaboration. Our work demonstrates that successfully augmenting agents requires learning how to request expert reasoning.
- Score: 18.166049121801016
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Model (LLM)-based agents excel at general reasoning but often fail in specialized domains where success hinges on long-tail knowledge absent from their training data. While human experts can provide this missing knowledge, their guidance is often unstructured and unreliable, making its direct integration into an agent's plan problematic. To address this, we introduce AHCE (Active Human-Augmented Challenge Engagement), a framework for on-demand Human-AI collaboration. At its core, the Human Feedback Module (HFM) employs a learned policy to treat the human expert as an interactive reasoning tool. Extensive experiments in Minecraft demonstrate the framework's effectiveness, increasing task success rates by 32% on normal-difficulty tasks and by nearly 70% on highly difficult tasks, all with minimal human intervention. Our work demonstrates that successfully augmenting agents requires learning how to request expert reasoning, moving beyond simple requests for help.
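To make the idea concrete, here is a minimal, hypothetical sketch of a learned gate that decides when to hand a plan step to the human expert. The abstract does not reveal the HFM's internals, so the logistic policy, the `uncertainty` and `novelty` features, and every identifier below are illustrative assumptions, not the paper's method.

```python
# Hypothetical sketch of a learned "ask-the-expert" gate, in the spirit of
# AHCE's Human Feedback Module. The logistic form and all names here are
# illustrative assumptions, not the paper's actual implementation.
from dataclasses import dataclass
import math
import random

@dataclass
class QueryPolicy:
    """Logistic gate over simple plan features, trainable from past episodes."""
    w_uncertainty: float = 2.0  # weight on the agent's self-reported uncertainty
    w_novelty: float = 1.5      # weight on how far the task is from training data
    bias: float = -2.5          # negative bias keeps interventions rare

    def should_query_expert(self, uncertainty: float, novelty: float) -> bool:
        logit = self.w_uncertainty * uncertainty + self.w_novelty * novelty + self.bias
        p_query = 1.0 / (1.0 + math.exp(-logit))
        return random.random() < p_query

def run_step(policy: QueryPolicy, plan_step: str,
             uncertainty: float, novelty: float) -> str:
    if policy.should_query_expert(uncertainty, novelty):
        # A real system would pose a *structured* question to the expert
        # and fold the reply back into the agent's plan.
        return f"ASK EXPERT about: {plan_step!r}"
    return f"EXECUTE: {plan_step!r}"

if __name__ == "__main__":
    random.seed(0)  # reproducible demo
    policy = QueryPolicy()
    print(run_step(policy, "smelt iron ore", uncertainty=0.2, novelty=0.1))
    print(run_step(policy, "enchant netherite sword", uncertainty=0.9, novelty=0.95))
```

The negative bias reflects the abstract's "minimal human intervention" goal: the gate stays closed unless the features argue strongly for help.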
Related papers
- AgentIF-OneDay: A Task-level Instruction-Following Benchmark for General AI Agents in Daily Scenarios [49.90735676070039]
The capacity of AI agents to effectively handle tasks of increasing duration and complexity continues to grow. We argue that current evaluations prioritize increasing task difficulty without sufficiently addressing the diversity of agentic tasks. We propose AgentIF-OneDay, aimed at determining whether general users can utilize natural language instructions and AI agents to complete a diverse array of daily tasks.
arXiv Detail & Related papers (2026-01-28T13:49:18Z)
- Training LLM Agents to Empower Humans [67.80021254324294]
We propose a new approach to tuning assistive language models based on maximizing the human's empowerment. Our empowerment-maximizing method, Empower, only requires offline text data. We show that agents trained with Empower increase the success rate of a simulated human programmer on challenging coding questions by an average of 192%.
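Empowerment is commonly formalized as mutual information between an actor's choices and the futures they can reach; since the summary gives no implementation details, the toy counting approximation below (including `empowerment_bonus`, `make_transition`, and the repo scenario) is an assumed illustration of the general objective, not the Empower method.

```python
# Toy illustration of an empowerment-style bonus: reward the assistant for
# leaving the simulated human with many distinguishable futures. A counting
# approximation on a tiny deterministic world; all names are assumptions.
import math

def empowerment_bonus(transition, state, human_actions):
    """log2 of the number of distinct next states the human can reach."""
    outcomes = {transition(state, a) for a in human_actions}
    return math.log2(len(outcomes)) if outcomes else 0.0

def make_transition(blocked):
    """Toy world: a blocked action leaves the state unchanged."""
    def transition(state, action):
        return state if action in blocked else f"{state}+{action}"
    return transition

if __name__ == "__main__":
    actions = ["refactor", "test", "deploy"]
    unblocked = make_transition(blocked=set())
    blocked = make_transition(blocked={"test", "deploy"})
    print(empowerment_bonus(unblocked, "repo@v1", actions))  # log2(3) ~ 1.58 bits
    print(empowerment_bonus(blocked, "repo@v1", actions))    # log2(2) = 1.0 bit
```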
arXiv Detail & Related papers (2025-10-15T16:09:33Z)
- Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning via Incorporating Generalized Human Expertise [6.441011477647557]
Efficient exploration in multi-agent reinforcement learning (MARL) is a challenging problem when agents receive only a team reward. A powerful way to mitigate this issue is to craft dense individual rewards that guide the agents toward efficient exploration. We propose a novel framework, LIGHT, which can integrate human knowledge into MARL algorithms in an end-to-end manner.
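The summary does not say how human knowledge becomes a dense individual reward; a classical stand-in is potential-based shaping from a human-written potential, sketched below. The Manhattan-distance potential, the weights, and the two-agent toy are all hypothetical, not LIGHT's actual mechanism.

```python
# Minimal sketch of dense per-agent reward shaping from human expertise:
# each agent gets the shared team reward plus an intrinsic term derived
# from a human-provided potential. Illustrative only, not LIGHT itself.
def shaped_rewards(team_reward, states, next_states, potential,
                   beta=0.1, gamma=0.99):
    """Potential-based shaping leaves the optimal team policy unchanged."""
    return [
        team_reward + beta * (gamma * potential(s2) - potential(s1))
        for s1, s2 in zip(states, next_states)
    ]

# Human expertise encoded as a potential: "being near the objective is good".
def distance_potential(state):
    x, y = state
    return -abs(x - 5) - abs(y - 5)  # objective assumed at (5, 5)

if __name__ == "__main__":
    states = [(0, 0), (9, 9)]
    next_states = [(1, 0), (9, 8)]  # both agents step toward (5, 5)
    print(shaped_rewards(team_reward=0.0, states=states,
                         next_states=next_states,
                         potential=distance_potential))  # ~0.11 each
```

Potential-based shaping is a deliberate choice for the sketch because the added dense signal provably does not alter which joint policy is optimal.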
arXiv Detail & Related papers (2025-07-25T00:59:10Z)
- SciMaster: Towards General-Purpose Scientific AI Agents, Part I. X-Master as Foundation: Can We Lead on Humanity's Last Exam? [51.112225746095746]
We introduce X-Master, a tool-augmented reasoning agent designed to emulate human researchers. X-Masters sets a new state-of-the-art record on Humanity's Last Exam with a score of 32.1%.
arXiv Detail & Related papers (2025-07-07T17:50:52Z)
- Robot-Gated Interactive Imitation Learning with Adaptive Intervention Mechanism [48.41735416075536]
Interactive Imitation Learning (IIL) allows agents to acquire desired behaviors through human interventions. We propose the Adaptive Intervention Mechanism (AIM), a novel robot-gated IIL algorithm that learns an adaptive criterion for requesting human demonstrations.
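A hedged sketch of what a robot-gated rule can look like: request a demonstration when an uncertainty signal crosses a threshold that adapts toward a target intervention rate. The ensemble-disagreement signal and the threshold update below are assumptions for illustration; AIM's learned criterion is what the paper itself describes.

```python
# Illustrative robot-gated intervention rule (not AIM's algorithm): gate on
# disagreement among sampled policy actions, with a self-adapting threshold.
import statistics

class InterventionGate:
    def __init__(self, threshold=0.5, lr=0.05, target_rate=0.1):
        self.threshold = threshold
        self.lr = lr                    # how fast the threshold adapts
        self.target_rate = target_rate  # desired fraction of human take-overs

    def request_demo(self, action_samples):
        """True when disagreement among sampled 1-D actions exceeds the gate."""
        disagreement = statistics.pstdev(action_samples)
        asked = disagreement > self.threshold
        # Nudge the threshold so interventions hover near the target rate.
        self.threshold += self.lr * ((1.0 if asked else 0.0) - self.target_rate)
        return asked

if __name__ == "__main__":
    gate = InterventionGate()
    print(gate.request_demo([0.10, 0.12, 0.09]))  # confident  -> False
    print(gate.request_demo([-1.0, 0.8, 1.2]))    # uncertain  -> True
```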
arXiv Detail & Related papers (2025-06-10T18:43:26Z)
- Collaborative Gym: A Framework for Enabling and Evaluating Human-Agent Collaboration [50.657070334404835]
Collaborative Gym is a framework enabling asynchronous, tripartite interaction among agents, humans, and task environments. We instantiate Co-Gym with three representative tasks in both simulated and real-world conditions. Our findings reveal that collaborative agents consistently outperform their fully autonomous counterparts in task performance.
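The summary's key notion, asynchronous interaction, can be illustrated with a few queues; the protocol below (the queue names and the `workspace` list standing in for the shared task environment) is an assumed toy, not Co-Gym's actual API.

```python
# Toy asynchronous agent-human exchange over a shared workspace, loosely in
# the spirit of Collaborative Gym's setting. All names are assumptions.
import asyncio

async def agent(to_human, to_agent, workspace):
    """Proposes work, then waits (without blocking the human) for feedback."""
    workspace.append(("agent", "proposal: outline the report"))
    await to_human.put("outline the report?")
    feedback = await to_agent.get()
    workspace.append(("agent", f"revision incorporating: {feedback}"))

async def human(to_human, to_agent, workspace):
    """Simulated human: reviews the proposal whenever it arrives."""
    proposal = await to_human.get()
    workspace.append(("human", f"reviewed {proposal!r}"))
    await to_agent.put("add a results table")

async def main():
    to_human, to_agent = asyncio.Queue(), asyncio.Queue()
    workspace = []  # stands in for the shared task environment
    await asyncio.gather(agent(to_human, to_agent, workspace),
                         human(to_human, to_agent, workspace))
    for speaker, message in workspace:
        print(f"{speaker}: {message}")

asyncio.run(main())
```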
arXiv Detail & Related papers (2024-12-20T09:21:15Z)
- Towards Collaborative Question Answering: A Preliminary Study [63.91687114660126]
We propose CollabQA, a novel QA task in which several expert agents, coordinated by a moderator, work together to answer questions that no single agent can answer alone.
We construct a synthetic dataset from a large knowledge graph that can be partitioned among the experts.
We show that the problem is challenging without a prior over the collaboration structure, unless the experts are perfect and uniform.
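As a toy illustration of the setup, the sketch below shards a small knowledge graph across expert agents and has a moderator poll them; every name and the routing strategy are assumptions, not CollabQA's protocol.

```python
# Moderator-coordinated expert QA over a sharded knowledge graph (toy).
def make_expert(facts):
    """Each expert answers only from its own (entity, relation) shard."""
    def expert(entity, relation):
        return facts.get((entity, relation))
    return expert

def moderator(experts, entity, relation):
    """Poll every expert; no single one can answer all questions alone."""
    answers = [e(entity, relation) for e in experts]
    found = [a for a in answers if a is not None]
    return found[0] if found else "unknown"

if __name__ == "__main__":
    geo = make_expert({("Paris", "country"): "France"})
    bio = make_expert({("Curie", "field"): "physics"})
    print(moderator([geo, bio], "Curie", "field"))  # -> physics
    print(moderator([geo, bio], "Paris", "mayor"))  # -> unknown
```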
arXiv Detail & Related papers (2022-01-24T14:27:00Z)