Related papers: SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks

SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks

URL: http://arxiv.org/abs/2305.17390v2
Date: Wed, 6 Dec 2023 10:07:01 GMT
Title: SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks
Authors: Bill Yuchen Lin, Yicheng Fu, Karina Yang, Faeze Brahman, Shiyu Huang, Chandra Bhagavatula, Prithviraj Ammanabrolu, Yejin Choi, Xiang Ren
Abstract summary: We introduce SwiftSage, a novel agent framework inspired by the dual-process theory of human cognition. The framework comprises two primary modules: the Swift module, representing fast and intuitive thinking, and the Sage module, emulating deliberate thought processes. In 30 tasks from the ScienceWorld benchmark, SwiftSage significantly outperforms other methods such as SayCan, ReAct, and Reflex.
Score: 81.9962823875981
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We introduce SwiftSage, a novel agent framework inspired by the dual-process theory of human cognition, designed to excel in action planning for complex interactive reasoning tasks. SwiftSage integrates the strengths of behavior cloning and prompting large language models (LLMs) to enhance task completion performance. The framework comprises two primary modules: the Swift module, representing fast and intuitive thinking, and the Sage module, emulating deliberate thought processes. The Swift module is a small encoder-decoder LM fine-tuned on the oracle agent's action trajectories, while the Sage module employs LLMs such as GPT-4 for subgoal planning and grounding. We develop a heuristic method to harmoniously integrate the two modules, resulting in a more efficient and robust problem-solving process. In 30 tasks from the ScienceWorld benchmark, SwiftSage significantly outperforms other methods such as SayCan, ReAct, and Reflexion, demonstrating its effectiveness in solving complex interactive tasks.

Related papers

Fast-Slow-Thinking: Complex Task Solving with Large Language Models [49.98959729052245]
This paper introduces a new task decomposition method termed Fast-Slow-Thinking'' (FST) In FT, LLMs are prompted to remove the constraints of the original task, therefore simplifying it to a general and concise one. In ST, we recall the constraints removed in FT, so that LLMs can improve the answer generated in FT to meet the requirements of the original task.
arXiv Detail & Related papers (2025-04-11T16:57:36Z)
Learning to Chain Operations by Routing Information Through a Global Workspace [3.1614158472531435]
We present a model inspired by the Global Workspace Theory that integrates specialized modules to perform a sequential reasoning task. We evaluate the model's performance on a simple addition task, where two addends must be summed. Our results highlight the potential of architectures inspired by the Global Workspace Theory to enhance deep learning's reasoning capabilities.
arXiv Detail & Related papers (2025-02-28T15:30:55Z)
Interactive and Expressive Code-Augmented Planning with Large Language Models [62.799579304821826]
Large Language Models (LLMs) demonstrate strong abilities in common-sense reasoning and interactive decision-making. Recent techniques have sought to structure LLM outputs using control flow and other code-adjacent techniques to improve planning performance. We propose REPL-Plan, an LLM planning approach that is fully code-expressive and dynamic.
arXiv Detail & Related papers (2024-11-21T04:23:17Z)
AgentSquare: Automatic LLM Agent Search in Modular Design Space [16.659969168343082]
Large Language Models (LLMs) have led to a rapid growth of agentic systems capable of handling a wide range of complex tasks. We introduce a new research problem: Modularized LLM Agent Search (MoLAS)
arXiv Detail & Related papers (2024-10-08T15:52:42Z)
Iteration of Thought: Leveraging Inner Dialogue for Autonomous Large Language Model Reasoning [0.0]
Iterative human engagement is a common and effective means of leveraging the advanced language processing power of large language models (LLMs) We propose the Iteration of Thought (IoT) framework for enhancing LLM responses by generating "thought"-provoking prompts. Unlike static or semi-static approaches, IoT adapts its reasoning path dynamically, based on evolving context.
arXiv Detail & Related papers (2024-09-19T09:44:17Z)
DynaThink: Fast or Slow? A Dynamic Decision-Making Framework for Large Language Models [42.95876831743256]
Large language models (LLMs) have demonstrated emergent capabilities across diverse reasoning tasks via Chains-of-Thought prompting. This paper addresses the challenge of enabling LLMs to autonomously select between fast and slow inference methods. We introduce a dynamic decision-making framework that categorizes tasks into two distinct pathways: 'Fast', designated for tasks where the LLM quickly identifies a high-confidence solution, and 'Slow', allocated for tasks that the LLM perceives as complex.
arXiv Detail & Related papers (2024-07-01T06:45:13Z)
APPL: A Prompt Programming Language for Harmonious Integration of Programs and Large Language Model Prompts [21.819126948549766]
Large Language Models (LLMs) have become increasingly capable of handling diverse tasks with the aid of well-crafted prompts. APPL acts as a bridge between computer programs and LLMs, allowing seamless embedding of prompts into Python functions.
arXiv Detail & Related papers (2024-06-19T02:29:59Z)
RL-GPT: Integrating Reinforcement Learning and Code-as-policy [82.1804241891039]
We introduce a two-level hierarchical framework, RL-GPT, comprising a slow agent and a fast agent. The slow agent analyzes actions suitable for coding, while the fast agent executes coding tasks. This decomposition effectively focuses each agent on specific tasks, proving highly efficient within our pipeline.
arXiv Detail & Related papers (2024-02-29T16:07:22Z)
CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules [51.82044734879657]
We propose CodeChain, a novel framework for inference that elicits modularized code generation through a chain of self-revisions. We find that CodeChain can significantly boost both modularity as well as correctness of the generated solutions, achieving relative pass@1 improvements of 35% on APPS and 76% on CodeContests.
arXiv Detail & Related papers (2023-10-13T10:17:48Z)
Improving Planning with Large Language Models: A Modular Agentic Architecture [7.63815864256878]
Large language models (LLMs) often struggle with tasks that require multi-step reasoning or goal-directed planning. We propose an agentic architecture, the Modular Agentic Planner (MAP), in which planning is accomplished via the recurrent interaction of specialized modules. We find that MAP yields significant improvements over both standard LLM methods.
arXiv Detail & Related papers (2023-09-30T00:10:14Z)
Low-code LLM: Graphical User Interface over Large Language Models [115.08718239772107]
This paper introduces a novel human-LLM interaction framework, Low-code LLM. It incorporates six types of simple low-code visual programming interactions to achieve more controllable and stable responses. We highlight three advantages of the low-code LLM: user-friendly interaction, controllable generation, and wide applicability.
arXiv Detail & Related papers (2023-04-17T09:27:40Z)
Decomposed Prompting: A Modular Approach for Solving Complex Tasks [55.42850359286304]
We propose Decomposed Prompting to solve complex tasks by decomposing them (via prompting) into simpler sub-tasks. This modular structure allows each prompt to be optimized for its specific sub-task. We show that the flexibility and modularity of Decomposed Prompting allows it to outperform prior work on few-shot prompting.
arXiv Detail & Related papers (2022-10-05T17:28:20Z)

This list is automatically generated from the titles and abstracts of the papers in this site.