TRAD: Enhancing LLM Agents with Step-Wise Thought Retrieval and Aligned
Decision
- URL: http://arxiv.org/abs/2403.06221v1
- Date: Sun, 10 Mar 2024 13:58:38 GMT
- Title: TRAD: Enhancing LLM Agents with Step-Wise Thought Retrieval and Aligned
Decision
- Authors: Ruiwen Zhou, Yingxuan Yang, Muning Wen, Ying Wen, Wenhao Wang,
Chunling Xi, Guoqiang Xu, Yong Yu, Weinan Zhang
- Abstract summary: Large language model (LLM) agents have been built for different tasks like web navigation and online shopping.
In this paper, we propose a novel framework (TRAD) to address these issues.
TRAD conducts Thought Retrieval, achieving step-level demonstration selection via thought matching.
Then, TRAD introduces Aligned Decision, complementing retrieved demonstration steps with their previous or subsequent steps.
- Score: 32.24857534147114
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Numerous large language model (LLM) agents have been built for different
tasks like web navigation and online shopping due to LLM's wide knowledge and
text-understanding ability. Among these works, many of them utilize in-context
examples to achieve generalization without the need for fine-tuning, while few
of them have considered the problem of how to select and effectively utilize
these examples. Recently, methods based on trajectory-level retrieval with task
meta-data and using trajectories as in-context examples have been proposed to
improve the agent's overall performance in some sequential decision making
tasks. However, these methods can be problematic due to plausible examples
retrieved without task-specific state transition dynamics and long input with
plenty of irrelevant context. In this paper, we propose a novel framework
(TRAD) to address these issues. TRAD first conducts Thought Retrieval,
achieving step-level demonstration selection via thought matching, leading to
more helpful demonstrations and less irrelevant input noise. Then, TRAD
introduces Aligned Decision, complementing retrieved demonstration steps with
their previous or subsequent steps, which enables tolerance for imperfect
thought and provides a choice for balance between more context and less noise.
Extensive experiments on ALFWorld and Mind2Web benchmarks show that TRAD not
only outperforms state-of-the-art models but also effectively helps in reducing
noise and promoting generalization. Furthermore, TRAD has been deployed in
real-world scenarios of a global business insurance company and improves the
success rate of robotic process automation.
Related papers
- MLLM as Retriever: Interactively Learning Multimodal Retrieval for Embodied Agents [28.419007116364668]
MLLM agents demonstrate potential for complex embodied tasks by retrieving multimodal task-relevant trajectory data.
Current retrieval methods primarily focus on surface-level similarities of textual or visual cues in trajectories, neglecting their effectiveness for the specific task at hand.
We propose a novel method, MLLM as ReTriever (MART), which enhances the performance of embodied agents by utilizing interaction data to fine-tune an MLLM retriever.
arXiv Detail & Related papers (2024-10-04T14:10:39Z) - Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities.
We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z) - Textualized Agent-Style Reasoning for Complex Tasks by Multiple Round LLM Generation [49.27250832754313]
We present AgentCOT, a llm-based autonomous agent framework.
At each step, AgentCOT selects an action and executes it to yield an intermediate result with supporting evidence.
We introduce two new strategies to enhance the performance of AgentCOT.
arXiv Detail & Related papers (2024-09-19T02:20:06Z) - Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning [79.38140606606126]
We propose an algorithmic framework that fine-tunes vision-language models (VLMs) with reinforcement learning (RL)
Our framework provides a task description and then prompts the VLM to generate chain-of-thought (CoT) reasoning.
We demonstrate that our proposed framework enhances the decision-making capabilities of VLM agents across various tasks.
arXiv Detail & Related papers (2024-05-16T17:50:19Z) - Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language Models as Agents [41.14201835950814]
Large language models (LLMs) have achieved success in acting as agents, which interact with environments through tools such as search engines.
Previous work has first collected interaction trajectories between LLMs and environments, using only trajectories that successfully finished the task to fine-tune smaller models.
We argue that unsuccessful trajectories offer valuable insights, and LLMs can learn from these trajectories through appropriate quality control and fine-tuning strategies.
arXiv Detail & Related papers (2024-02-18T17:10:07Z) - Misconfidence-based Demonstration Selection for LLM In-Context Learning [0.0]
In-context learning with large language models (LLMs) excels at adapting to various tasks rapidly.
Current approaches to this problem either rely on hard-to-acquire external supervision or require frequent interactions with LLMs.
We propose a new method called In-Context Reflection (ICR) to overcome these challenges.
arXiv Detail & Related papers (2024-01-12T00:11:24Z) - OrchestraLLM: Efficient Orchestration of Language Models for Dialogue State Tracking [16.057622631156164]
Large language models (LLMs) have revolutionized the landscape of Natural Language Processing systems, but are computationally expensive.
Previous studies have explored various approaches to harness the potential of Small Language Models (SLMs) as cost-effective alternatives to their larger counterparts.
This work presents a novel SLM/LLM routing framework designed to improve computational efficiency and enhance task performance.
arXiv Detail & Related papers (2023-11-16T10:30:55Z) - LLMs Learn Task Heuristics from Demonstrations: A Heuristic-Driven Prompting Strategy for Document-Level Event Argument Extraction [12.673710691468264]
We introduce the Heuristic-Driven Link-of- Analogy (HD-LoA) prompting to address the challenge of example selection.
Inspired by the analogical reasoning of human, we propose the link-of-analogy prompting, which enables LLMs to process new situations.
Experiments show that our method outperforms existing prompting methods and few-shot supervised learning methods on document-level EAE datasets.
arXiv Detail & Related papers (2023-11-11T12:05:01Z) - LASER: LLM Agent with State-Space Exploration for Web Navigation [57.802977310392755]
Large language models (LLMs) have been successfully adapted for interactive decision-making tasks like web navigation.
Previous methods implicitly assume a forward-only execution mode for the model, where they only provide oracle trajectories as in-context examples.
We propose to model the interactive task as state space exploration, where the LLM agent transitions among a pre-defined set of states by performing actions to complete the task.
arXiv Detail & Related papers (2023-09-15T05:44:08Z) - CSS-LM: A Contrastive Framework for Semi-supervised Fine-tuning of
Pre-trained Language Models [59.49705076369856]
We introduce a novel framework to improve the fine-tuning phase of pre-trained language models (PLMs)
We retrieve positive and negative instances from large-scale unlabeled corpora according to their domain-level and class-level semantic relatedness to a task.
We then perform contrastive semi-supervised learning on both the retrieved unlabeled and original labeled instances to help PLMs capture crucial task-related semantic features.
arXiv Detail & Related papers (2021-02-07T09:27:26Z) - Continuous Transition: Improving Sample Efficiency for Continuous
Control Problems via MixUp [119.69304125647785]
This paper introduces a concise yet powerful method to construct Continuous Transition.
Specifically, we propose to synthesize new transitions for training by linearly interpolating the consecutive transitions.
To keep the constructed transitions authentic, we also develop a discriminator to guide the construction process automatically.
arXiv Detail & Related papers (2020-11-30T01:20:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.