Related papers: Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding

Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding

URL: http://arxiv.org/abs/2401.12954v1
Date: Tue, 23 Jan 2024 18:22:19 GMT
Title: Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding
Authors: Mirac Suzgun, Adam Tauman Kalai
Abstract summary: We introduce meta-prompting, an effective scaffolding technique designed to enhance the functionality of language models (LMs) By employing high-level instructions, meta-prompting guides the LM to break down complex tasks into smaller, more manageable subtasks. Central to this process is the LM itself, in its role as the conductor, which ensures seamless communication and effective integration of the outputs.
Score: 15.04954445749935
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We introduce meta-prompting, an effective scaffolding technique designed to enhance the functionality of language models (LMs). This approach transforms a single LM into a multi-faceted conductor, adept at managing and integrating multiple independent LM queries. By employing high-level instructions, meta-prompting guides the LM to break down complex tasks into smaller, more manageable subtasks. These subtasks are then handled by distinct "expert" instances of the same LM, each operating under specific, tailored instructions. Central to this process is the LM itself, in its role as the conductor, which ensures seamless communication and effective integration of the outputs from these expert models. It additionally employs its inherent critical thinking and robust verification processes to refine and authenticate the end result. This collaborative prompting approach empowers a single LM to simultaneously act as a comprehensive orchestrator and a panel of diverse experts, significantly enhancing its performance across a wide array of tasks. The zero-shot, task-agnostic nature of meta-prompting greatly simplifies user interaction by obviating the need for detailed, task-specific instructions. Furthermore, our research demonstrates the seamless integration of external tools, such as a Python interpreter, into the meta-prompting framework, thereby broadening its applicability and utility. Through rigorous experimentation with GPT-4, we establish the superiority of meta-prompting over conventional scaffolding methods: When averaged across all tasks, including the Game of 24, Checkmate-in-One, and Python Programming Puzzles, meta-prompting, augmented with a Python interpreter functionality, surpasses standard prompting by 17.1%, expert (dynamic) prompting by 17.3%, and multipersona prompting by 15.2%.

Related papers

The Future of MLLM Prompting is Adaptive: A Comprehensive Experimental Evaluation of Prompt Engineering Methods for Robust Multimodal Performance [0.393259574660092]
Multimodal Large Language Models (MLLMs) are set to transform how machines process and generate human-like responses. We present a comprehensive experimental evaluation of seven prompt engineering methods applied to 13 open-source MLLMs over 24 tasks.
arXiv Detail & Related papers (2025-04-14T12:31:39Z)
EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents [63.43699771428243]
EmbodiedBench is an extensive benchmark designed to evaluate vision-driven embodied agents. We evaluated 19 leading proprietary and open-source MLLMs within EmbodiedBench. MLLMs excel at high-level tasks but struggle with low-level manipulation.
arXiv Detail & Related papers (2025-02-13T18:11:34Z)
One Prompt is not Enough: Automated Construction of a Mixture-of-Expert Prompts [110.94724216491753]
Large Language Models (LLMs) exhibit strong generalization capabilities when prompted with language instructions and in-context demos. Various methods have been explored to automate the instruction design, but they restricted the searched prompt to one instruction. We adopt the Mixture-of-Expert paradigm and divide the problem space into a set of sub-regions. A two-phase process is developed to construct the specialized expert for each region. A region-based joint search of an instruction per expert complements the demos assigned to it, yielding a synergistic effect.
arXiv Detail & Related papers (2024-06-28T23:05:08Z)
Do We Really Need a Complex Agent System? Distill Embodied Agent into a Single Model [15.558269067931374]
We propose STEVE-2, a hierarchical knowledge distillation framework for open-ended embodied tasks. After distillation, embodied agents can complete complex, open-ended tasks without additional expert guidance.
arXiv Detail & Related papers (2024-04-06T12:51:00Z)
Small LLMs Are Weak Tool Learners: A Multi-LLM Agent [73.54562551341454]
Large Language Model (LLM) agents significantly extend the capabilities of standalone LLMs. We propose a novel approach that decomposes the aforementioned capabilities into a planner, caller, and summarizer. This modular framework facilitates individual updates and the potential use of smaller LLMs for building each capability.
arXiv Detail & Related papers (2024-01-14T16:17:07Z)
INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning [59.07490387145391]
Large language models (LLMs) have demonstrated impressive capabilities in various natural language processing tasks. Their application to information retrieval (IR) tasks is still challenging due to the infrequent occurrence of many IR-specific concepts in natural language. We introduce a novel instruction tuning dataset, INTERS, encompassing 20 tasks across three fundamental IR categories.
arXiv Detail & Related papers (2024-01-12T12:10:28Z)
MmAP : Multi-modal Alignment Prompt for Cross-domain Multi-task Learning [29.88567810099265]
Multi-task learning is designed to train multiple correlated tasks simultaneously. To tackle this challenge, we integrate the decoder-free vision-language model CLIP. We propose Multi-modal Alignment Prompt (MmAP) for CLIP, which aligns text and visual modalities during fine-tuning process.
arXiv Detail & Related papers (2023-12-14T03:33:02Z)
TaskBench: Benchmarking Large Language Models for Task Automation [82.2932794189585]
We introduce TaskBench, a framework to evaluate the capability of large language models (LLMs) in task automation. Specifically, task decomposition, tool selection, and parameter prediction are assessed. Our approach combines automated construction with rigorous human verification, ensuring high consistency with human evaluation.
arXiv Detail & Related papers (2023-11-30T18:02:44Z)
Meta Prompting for AI Systems [12.304069891580658]
We introduce Meta Prompting (MP), a prompting paradigm designed to enhance the utilization of large language models and AI systems. MP prioritizes structural and syntactical considerations over traditional content-centric methods. Empirical evaluations reveal that a Qwen-72B base language model equipped with Meta Prompting-without additional instruction tuning-achieves a PASS@1 accuracy of 46.3%.
arXiv Detail & Related papers (2023-11-20T01:51:13Z)
AskIt: Unified Programming Interface for Programming with Large Language Models [0.0]
Large Language Models (LLMs) exhibit a unique phenomenon known as emergent abilities, demonstrating adeptness across numerous tasks. This paper introduces AskIt, a domain-specific language specifically designed for LLMs. Across 50 tasks, AskIt generated concise prompts, achieving a 16.14 % reduction in prompt length compared to benchmarks.
arXiv Detail & Related papers (2023-08-29T21:44:27Z)
Decomposed Prompting: A Modular Approach for Solving Complex Tasks [55.42850359286304]
We propose Decomposed Prompting to solve complex tasks by decomposing them (via prompting) into simpler sub-tasks. This modular structure allows each prompt to be optimized for its specific sub-task. We show that the flexibility and modularity of Decomposed Prompting allows it to outperform prior work on few-shot prompting.
arXiv Detail & Related papers (2022-10-05T17:28:20Z)
Improving Task Generalization via Unified Schema Prompt [87.31158568180514]
Unified Prompt is a flexible and prompting method, which automatically customizes the learnable prompts for each task according to the task input schema. It models the shared knowledge between tasks, while keeping the characteristics of different task schema. The framework achieves strong zero-shot and few-shot performance on 16 unseen tasks downstream from 8 task types.
arXiv Detail & Related papers (2022-08-05T15:26:36Z)

This list is automatically generated from the titles and abstracts of the papers in this site.