Related papers: QueryAgent: A Reliable and Efficient Reasoning Framework with Environmental Feedback-based Self-Correction

URL: http://arxiv.org/abs/2403.11886v2
Date: Thu, 13 Jun 2024 13:18:43 GMT
Title: QueryAgent: A Reliable and Efficient Reasoning Framework with Environmental Feedback-based Self-Correction
Authors: Xiang Huang, Sitao Cheng, Shanshan Huang, Jiayu Shen, Yong Xu, Chaoyun Zhang, Yuzhong Qu
Abstract summary: We introduce an environmental feedback-based self-correction method called ERASER. Experimental results demonstrate that QueryAgent notably outperforms all previous few-shot methods. Our approach exhibits superiority in terms of efficiency, including runtime, query overhead, and API invocation costs.
Score: 18.383499080327542
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Employing Large Language Models (LLMs) for semantic parsing has achieved remarkable success. However, we find existing methods fall short in terms of reliability and efficiency when hallucinations are encountered. In this paper, we address these challenges with a framework called QueryAgent, which solves a question step-by-step and performs step-wise self-correction. We introduce an environmental feedback-based self-correction method called ERASER. Unlike traditional approaches, ERASER leverages rich environmental feedback in the intermediate steps to perform selective and differentiated self-correction only when necessary. Experimental results demonstrate that QueryAgent notably outperforms all previous few-shot methods using only one example on GrailQA and GraphQ by 7.0 and 15.0 F1. Moreover, our approach exhibits superiority in terms of efficiency, including runtime, query overhead, and API invocation costs. By leveraging ERASER, we further improve another baseline (i.e., AgentBench) by approximately 10 points, revealing the strong transferability of our approach.
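
The step-wise loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation of QueryAgent or ERASER: the names `Feedback`, `solve_stepwise`, `toy_propose`, and `toy_execute` are hypothetical, the "environment" is a toy set of valid step strings, and the proposer stands in for an LLM call. It only shows the general idea of correcting a step when, and only when, the environment reports a problem.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Feedback:
    """Outcome of running one step against the environment (hypothetical type)."""
    ok: bool
    message: str = ""


def solve_stepwise(
    propose: Callable[[List[str], Optional[str]], str],
    execute: Callable[[str], Feedback],
    max_steps: int = 5,
    max_retries: int = 2,
) -> List[str]:
    """Build a solution step by step; retry a step only when feedback flags an error."""
    accepted: List[str] = []
    for _ in range(max_steps):
        hint: Optional[str] = None  # no correction hint until the environment complains
        for _attempt in range(max_retries + 1):
            step = propose(accepted, hint)
            if step == "DONE":
                return accepted
            feedback = execute(step)
            if feedback.ok:
                accepted.append(step)
                break
            hint = feedback.message  # pass the error back to the proposer
        else:
            break  # retries exhausted for this step; stop early
    return accepted


# Toy stand-ins for the LLM and the environment (assumptions for illustration only).
PLAN = ["get_relation(city)", "join(population)"]


def toy_propose(history: List[str], hint: Optional[str]) -> str:
    if len(history) >= len(PLAN):
        return "DONE"
    if not history and hint is None:
        return "get_relation(citi)"  # deliberate first-try mistake
    return PLAN[len(history)]


def toy_execute(step: str) -> Feedback:
    if step in PLAN:
        return Feedback(ok=True)
    return Feedback(ok=False, message=f"unknown step: {step}")


result = solve_stepwise(toy_propose, toy_execute)
print(result)
```

In this toy run the first proposal is rejected by the environment, the error message is passed back as a hint, and the corrected step is accepted; later steps that succeed on the first try trigger no correction at all.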

Related papers

Runaway is Ashamed, But Helpful: On the Early-Exit Behavior of Large Language Model-based Agents in Embodied Environments [55.044159987218436]
Large language models (LLMs) have demonstrated strong planning and decision-making capabilities in complex embodied environments. We take a first step toward exploring the early-exit behavior for LLM-based agents.
arXiv Detail & Related papers (2025-05-23T08:23:36Z)
ConvSearch-R1: Enhancing Query Reformulation for Conversational Search with Reasoning via Reinforcement Learning [45.37734114816888]
We present ConvSearch-R1, a framework that eliminates dependency on external rewrite supervision by leveraging reinforcement learning to optimize reformulation directly through retrieval signals. Our novel two-stage approach combines Self-Driven Policy Warm-Up to address the cold-start problem through retrieval-guided self-distillation, followed by Retrieval-Guided Reinforcement Learning with a specially designed rank-incentive reward shaping mechanism that addresses the sparsity issue in conventional retrieval metrics.
arXiv Detail & Related papers (2025-05-21T17:27:42Z)
Leveraging LLM Inconsistency to Boost Pass@k Performance [3.797421474324735]
Large language models (LLMs) achieve impressive abilities in numerous domains, but exhibit inconsistent performance in response to minor input changes. We introduce a novel method for leveraging models' inconsistency to boost Pass@k performance. Specifically, we present a "Variator" agent that generates k variants of a given task and submits one candidate solution for each one.
arXiv Detail & Related papers (2025-05-19T10:22:04Z)
Review, Refine, Repeat: Understanding Iterative Decoding of AI Agents with Dynamic Evaluation and Selection [71.92083784393418]
Inference-time methods such as Best-of-N (BON) sampling offer a simple yet effective alternative to improve performance. We propose Iterative Agent Decoding (IAD) which combines iterative refinement with dynamic candidate evaluation and selection guided by a verifier.
arXiv Detail & Related papers (2025-04-02T17:40:47Z)
SPARC: Score Prompting and Adaptive Fusion for Zero-Shot Multi-Label Recognition in Vision-Language Models [74.40683913645731]
Zero-shot multi-label recognition (MLR) with Vision-Language Models (VLMs) faces significant challenges without training data, model tuning, or architectural modifications. Our work proposes a novel solution treating VLMs as black boxes, leveraging scores without training data or ground truth. Analysis of these prompt scores reveals VLM biases and "AND"/"OR" signal ambiguities, notably that maximum scores are surprisingly suboptimal compared to second-highest scores.
arXiv Detail & Related papers (2025-02-24T07:15:05Z)
QLASS: Boosting Language Agent Inference via Q-Guided Stepwise Search [89.97082652805904]
We propose QLASS (Q-guided Language Agent Stepwise Search), to automatically generate annotations by estimating Q-values. With the stepwise guidance, we propose a Q-guided generation strategy to enable language agents to better adapt to long-term value. We empirically demonstrate that QLASS can lead to more effective decision making through qualitative analysis.
arXiv Detail & Related papers (2025-02-04T18:58:31Z)
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training [18.896813839389893]
We propose an iterative self-training framework, Agent-R, that enables language Agent to Reflect on the fly. Unlike traditional methods that reward or penalize actions based on correctness, Agent-R leverages MCTS to construct training data that recover correct trajectories from erroneous ones. Our findings demonstrate that Agent-R continuously improves the model's ability to recover from errors and enables timely error correction.
arXiv Detail & Related papers (2025-01-20T11:46:04Z)
From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process. We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
Textualized Agent-Style Reasoning for Complex Tasks by Multiple Round LLM Generation [49.27250832754313]
We present AgentCOT, an LLM-based autonomous agent framework. At each step, AgentCOT selects an action and executes it to yield an intermediate result with supporting evidence. We introduce two new strategies to enhance the performance of AgentCOT.
arXiv Detail & Related papers (2024-09-19T02:20:06Z)
Self-Supervised Inference of Agents in Trustless Environments [44.99833362998488]
We propose a novel approach where agents can form swarms to produce high-quality responses effectively. This is accomplished by utilizing agents capable of data inference and ranking. We show that our approach is an order of magnitude faster than other trustless inference strategies reaching less than 125 ms validation latency.
arXiv Detail & Related papers (2024-09-12T20:32:07Z)
No Regrets: Investigating and Improving Regret Approximations for Curriculum Discovery [53.08822154199948]
Unsupervised Environment Design (UED) methods have gained recent attention as their adaptive curricula promise to enable agents to be robust to in- and out-of-distribution tasks. This work investigates how existing UED methods select training environments, focusing on task prioritisation metrics. We develop a method that directly trains on scenarios with high learnability.
arXiv Detail & Related papers (2024-08-27T14:31:54Z)
Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents [44.34340798542]
Large Language Models (LLMs) have shown remarkable capabilities in natural language tasks requiring complex reasoning. Traditional supervised pre-training on static datasets falls short in enabling autonomous agent capabilities. We propose a framework that combines guided Monte Carlo Tree Search (MCTS) with a self-critique mechanism and iterative fine-tuning on agent interactions.
arXiv Detail & Related papers (2024-08-13T20:52:13Z)
On Speeding Up Language Model Evaluation [48.51924035873411]
Development of prompt-based methods with Large Language Models (LLMs) requires making numerous decisions. We propose a novel method to address this challenge. We show that it can identify the top-performing method using only 5-15% of the typically needed resources.
arXiv Detail & Related papers (2024-07-08T17:48:42Z)
Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER). Our method exploits self-supervised pretraining to learn good feature representations from the target data. We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z)
KECP: Knowledge Enhanced Contrastive Prompting for Few-shot Extractive Question Answering [28.18555591429343]
We propose a novel framework named Knowledge Enhanced Contrastive Prompt-tuning (KECP). Instead of adding pointer heads to PLMs, we transform the task into a non-autoregressive Masked Language Modeling (MLM) generation problem. Our method consistently outperforms state-of-the-art approaches in few-shot settings by a large margin.
arXiv Detail & Related papers (2022-05-06T08:31:02Z)
Confidence-Aware Active Feedback for Efficient Instance Search [21.8172170825049]
Relevance feedback is widely used in instance search (INS) tasks to further refine imperfect ranking results. We propose a confidence-aware active feedback (CAAF) method that can efficiently select the most valuable feedback candidates. In particular, CAAF outperforms the first-place record in the public large-scale video INS evaluation of TRECVID 2021.
arXiv Detail & Related papers (2021-10-23T16:14:03Z)
BERT Loses Patience: Fast and Robust Inference with Early Exit [91.26199404912019]
We propose Patience-based Early Exit as a plug-and-play technique to improve the efficiency and robustness of a pretrained language model. Our approach improves inference efficiency as it allows the model to make a prediction with fewer layers.
arXiv Detail & Related papers (2020-06-07T13:38:32Z)

This list is automatically generated from the titles and abstracts of the papers in this site.