Related papers: TRAD: Enhancing LLM Agents with Step-Wise Thought Retrieval and Aligned Decision

TRAD: Enhancing LLM Agents with Step-Wise Thought Retrieval and Aligned Decision

URL: http://arxiv.org/abs/2403.06221v1
Date: Sun, 10 Mar 2024 13:58:38 GMT
Title: TRAD: Enhancing LLM Agents with Step-Wise Thought Retrieval and Aligned Decision
Authors: Ruiwen Zhou, Yingxuan Yang, Muning Wen, Ying Wen, Wenhao Wang, Chunling Xi, Guoqiang Xu, Yong Yu, Weinan Zhang
Abstract summary: Large language model (LLM) agents have been built for different tasks like web navigation and online shopping. In this paper, we propose a novel framework (TRAD) to address these issues. TRAD conducts Thought Retrieval, achieving step-level demonstration selection via thought matching. Then, TRAD introduces Aligned Decision, complementing retrieved demonstration steps with their previous or subsequent steps.
Score: 32.24857534147114
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Numerous large language model (LLM) agents have been built for different tasks like web navigation and online shopping due to LLM's wide knowledge and text-understanding ability. Among these works, many of them utilize in-context examples to achieve generalization without the need for fine-tuning, while few of them have considered the problem of how to select and effectively utilize these examples. Recently, methods based on trajectory-level retrieval with task meta-data and using trajectories as in-context examples have been proposed to improve the agent's overall performance in some sequential decision making tasks. However, these methods can be problematic due to plausible examples retrieved without task-specific state transition dynamics and long input with plenty of irrelevant context. In this paper, we propose a novel framework (TRAD) to address these issues. TRAD first conducts Thought Retrieval, achieving step-level demonstration selection via thought matching, leading to more helpful demonstrations and less irrelevant input noise. Then, TRAD introduces Aligned Decision, complementing retrieved demonstration steps with their previous or subsequent steps, which enables tolerance for imperfect thought and provides a choice for balance between more context and less noise. Extensive experiments on ALFWorld and Mind2Web benchmarks show that TRAD not only outperforms state-of-the-art models but also effectively helps in reducing noise and promoting generalization. Furthermore, TRAD has been deployed in real-world scenarios of a global business insurance company and improves the success rate of robotic process automation.

Related papers

DARS: Dynamic Action Re-Sampling to Enhance Coding Agent Performance by Adaptive Tree Traversal [55.13854171147104]
Large Language Models (LLMs) have revolutionized various domains, including natural language processing, data analysis, and software development. We present Dynamic Action Re-Sampling (DARS), a novel inference time compute scaling approach for coding agents. We evaluate our approach on SWE-Bench Lite benchmark, demonstrating that this scaling strategy achieves a pass@k score of 55% with Claude 3.5 Sonnet V2.
arXiv Detail & Related papers (2025-03-18T14:02:59Z)
Scaling Autonomous Agents via Automatic Reward Modeling And Planning [52.39395405893965]
Large language models (LLMs) have demonstrated remarkable capabilities across a range of tasks. However, they still struggle with problems requiring multi-step decision-making and environmental feedback. We propose a framework that can automatically learn a reward model from the environment without human annotations.
arXiv Detail & Related papers (2025-02-17T18:49:25Z)
Learning Task Representations from In-Context Learning [73.72066284711462]
Large language models (LLMs) have demonstrated remarkable proficiency in in-context learning. We introduce an automated formulation for encoding task information in ICL prompts as a function of attention heads. We show that our method's effectiveness stems from aligning the distribution of the last hidden state with that of an optimally performing in-context-learned model.
arXiv Detail & Related papers (2025-02-08T00:16:44Z)
MLLM as Retriever: Interactively Learning Multimodal Retrieval for Embodied Agents [28.419007116364668]
MLLM agents demonstrate potential for complex embodied tasks by retrieving multimodal task-relevant trajectory data. Current retrieval methods primarily focus on surface-level similarities of textual or visual cues in trajectories, neglecting their effectiveness for the specific task at hand. We propose a novel method, MLLM as ReTriever (MART), which enhances the performance of embodied agents by utilizing interaction data to fine-tune an MLLM retriever.
arXiv Detail & Related papers (2024-10-04T14:10:39Z)
Textualized Agent-Style Reasoning for Complex Tasks by Multiple Round LLM Generation [49.27250832754313]
We present AgentCOT, a llm-based autonomous agent framework. At each step, AgentCOT selects an action and executes it to yield an intermediate result with supporting evidence. We introduce two new strategies to enhance the performance of AgentCOT.
arXiv Detail & Related papers (2024-09-19T02:20:06Z)
MARS: Benchmarking the Metaphysical Reasoning Abilities of Language Models with a Multi-task Evaluation Dataset [50.36095192314595]
Large Language Models (LLMs) function as conscious agents with generalizable reasoning capabilities. This ability remains underexplored due to the complexity of modeling infinite possible changes in an event. We introduce the first-ever benchmark, MARS, comprising three tasks corresponding to each step.
arXiv Detail & Related papers (2024-06-04T08:35:04Z)
Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning [79.38140606606126]
We propose an algorithmic framework that fine-tunes vision-language models (VLMs) with reinforcement learning (RL) Our framework provides a task description and then prompts the VLM to generate chain-of-thought (CoT) reasoning. We demonstrate that our proposed framework enhances the decision-making capabilities of VLM agents across various tasks.
arXiv Detail & Related papers (2024-05-16T17:50:19Z)
Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language Models as Agents [41.14201835950814]
Large language models (LLMs) have achieved success in acting as agents, which interact with environments through tools such as search engines. Previous work has first collected interaction trajectories between LLMs and environments, using only trajectories that successfully finished the task to fine-tune smaller models. We argue that unsuccessful trajectories offer valuable insights, and LLMs can learn from these trajectories through appropriate quality control and fine-tuning strategies.
arXiv Detail & Related papers (2024-02-18T17:10:07Z)
Misconfidence-based Demonstration Selection for LLM In-Context Learning [0.0]
In-context learning with large language models (LLMs) excels at adapting to various tasks rapidly. Current approaches to this problem either rely on hard-to-acquire external supervision or require frequent interactions with LLMs. We propose a new method called In-Context Reflection (ICR) to overcome these challenges.
arXiv Detail & Related papers (2024-01-12T00:11:24Z)
OrchestraLLM: Efficient Orchestration of Language Models for Dialogue State Tracking [16.057622631156164]
Large language models (LLMs) have revolutionized the landscape of Natural Language Processing systems, but are computationally expensive. Previous studies have explored various approaches to harness the potential of Small Language Models (SLMs) as cost-effective alternatives to their larger counterparts. This work presents a novel SLM/LLM routing framework designed to improve computational efficiency and enhance task performance.
arXiv Detail & Related papers (2023-11-16T10:30:55Z)
LLMs Learn Task Heuristics from Demonstrations: A Heuristic-Driven Prompting Strategy for Document-Level Event Argument Extraction [12.673710691468264]
We introduce the Heuristic-Driven Link-of- Analogy (HD-LoA) prompting to address the challenge of example selection. Inspired by the analogical reasoning of human, we propose the link-of-analogy prompting, which enables LLMs to process new situations. Experiments show that our method outperforms existing prompting methods and few-shot supervised learning methods on document-level EAE datasets.
arXiv Detail & Related papers (2023-11-11T12:05:01Z)
LASER: LLM Agent with State-Space Exploration for Web Navigation [57.802977310392755]
Large language models (LLMs) have been successfully adapted for interactive decision-making tasks like web navigation. Previous methods implicitly assume a forward-only execution mode for the model, where they only provide oracle trajectories as in-context examples. We propose to model the interactive task as state space exploration, where the LLM agent transitions among a pre-defined set of states by performing actions to complete the task.
arXiv Detail & Related papers (2023-09-15T05:44:08Z)
CSS-LM: A Contrastive Framework for Semi-supervised Fine-tuning of Pre-trained Language Models [59.49705076369856]
We introduce a novel framework to improve the fine-tuning phase of pre-trained language models (PLMs) We retrieve positive and negative instances from large-scale unlabeled corpora according to their domain-level and class-level semantic relatedness to a task. We then perform contrastive semi-supervised learning on both the retrieved unlabeled and original labeled instances to help PLMs capture crucial task-related semantic features.
arXiv Detail & Related papers (2021-02-07T09:27:26Z)
Continuous Transition: Improving Sample Efficiency for Continuous Control Problems via MixUp [119.69304125647785]
This paper introduces a concise yet powerful method to construct Continuous Transition. Specifically, we propose to synthesize new transitions for training by linearly interpolating the consecutive transitions. To keep the constructed transitions authentic, we also develop a discriminator to guide the construction process automatically.
arXiv Detail & Related papers (2020-11-30T01:20:23Z)

This list is automatically generated from the titles and abstracts of the papers in this site.