Related papers: Autonomous Prompt Engineering in Large Language Models

URL: http://arxiv.org/abs/2407.11000v1
Date: Tue, 25 Jun 2024 10:14:44 GMT
Title: Autonomous Prompt Engineering in Large Language Models
Authors: Daan Kepel, Konstantina Valogianni
Abstract summary: This research introduces the Automatic Prompt Engineering Toolbox (APET), which enables GPT-4 to autonomously apply prompt engineering techniques. APET empowers GPT-4 to dynamically optimize prompts, resulting in substantial improvements in tasks like Word Sorting. This research represents a significant leap in AI development, presenting a robust framework for future innovations in autonomous AI systems.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Prompt engineering is a crucial yet challenging task for optimizing the performance of large language models (LLMs) on customized tasks. This pioneering research introduces the Automatic Prompt Engineering Toolbox (APET), which enables GPT-4 to autonomously apply prompt engineering techniques. By leveraging sophisticated strategies such as Expert Prompting, Chain of Thought, and Tree of Thoughts, APET empowers GPT-4 to dynamically optimize prompts, resulting in substantial improvements in tasks like Word Sorting (4.4% increase) and Geometric Shapes (6.8% increase). Despite encountering challenges in complex tasks such as Checkmate in One (-14.8%), these findings demonstrate the transformative potential of APET in automating complex prompt optimization processes without the use of external data. Overall, this research represents a significant leap in AI development, presenting a robust framework for future innovations in autonomous AI systems and highlighting the ability of GPT-4 to bring prompt engineering theory to practice. It establishes a foundation for enhancing performance on complex tasks and broadening the practical applications of these techniques in real-world scenarios.

Related papers

Weak-for-Strong: Training Weak Meta-Agent to Harness Strong Executors [104.5401871607713]
This paper proposes Weak-for-Strong Harnessing (W4S), a novel framework that customizes smaller, cost-efficient language models to design and optimize workflows for harnessing stronger models. W4S formulates workflow design as a multi-turn Markov decision process and introduces reinforcement learning for agentic workflow optimization. Empirical results demonstrate the superiority of W4S: our 7B meta-agent, trained with just one GPU hour, outperforms the strongest baseline by 2.9% to 24.6% across eleven benchmarks.
arXiv Detail & Related papers (2025-04-07T07:27:31Z)
Towards Autonomous Reinforcement Learning for Real-World Robotic Manipulation with Large Language Models [5.2364456910271935]
Reinforcement Learning (RL) enables agents to autonomously optimize complex behaviors through interaction and reward signals. In this work, we propose an unsupervised pipeline leveraging GPT-4, a pre-trained LLM, to generate reward functions directly from natural language task descriptions. The rewards are used to train RL agents in simulated environments, where we formalize the reward generation process to enhance feasibility.
arXiv Detail & Related papers (2025-03-06T10:08:44Z)
A Systematic Survey of Automatic Prompt Optimization Techniques [21.95159233568761]
We present a comprehensive survey summarizing the current progress and remaining challenges in this field. We provide a formal definition of APO, a 5-part unifying framework, and then proceed to rigorously categorize all relevant works based on their salient features therein.
arXiv Detail & Related papers (2025-02-24T07:29:13Z)
Lingma SWE-GPT: An Open Development-Process-Centric Language Model for Automated Software Improvement [62.94719119451089]
The Lingma SWE-GPT series learns from and simulates real-world code submission activities. Lingma SWE-GPT 72B resolves 30.20% of GitHub issues, marking a significant improvement in automatic issue resolution.
arXiv Detail & Related papers (2024-11-01T14:27:16Z)
On the role of Artificial Intelligence methods in modern force-controlled manufacturing robotic tasks [0.0]
AI's role in enhancing robotic manipulators is rapidly leading to significant innovations in smart manufacturing. This article frames these innovations in practical force-controlled applications, highlighting their necessity for maintaining high-quality production standards. The analysis concludes with a perspective on future research directions, emphasizing the need for common performance metrics to validate AI techniques.
arXiv Detail & Related papers (2024-09-25T11:29:26Z)
AI based Multiagent Approach for Requirements Elicitation and Analysis [3.9422957660677476]
This study empirically investigates the effectiveness of utilizing Large Language Models (LLMs) to automate requirements analysis tasks. We deployed four models, namely GPT-3.5, GPT-4 Omni, LLaMA3-70, and Mixtral-8B, and conducted experiments to analyze requirements on four real-world projects. Preliminary results indicate notable variations in task completion among the models.
arXiv Detail & Related papers (2024-08-18T07:23:12Z)
Unleashing the potential of prompt engineering in Large Language Models: a comprehensive review [1.6006550105523192]
This review explores the pivotal role of prompt engineering in unleashing the capabilities of Large Language Models (LLMs). It examines both foundational and advanced methodologies of prompt engineering, including techniques such as self-consistency, chain-of-thought, and generated knowledge. The review also reflects the essential role of prompt engineering in advancing AI capabilities, providing a structured framework for future research and application.
arXiv Detail & Related papers (2023-10-23T09:15:18Z)
The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision) [121.42924593374127]
We analyze the latest model, GPT-4V, to deepen the understanding of LMMs. GPT-4V's unprecedented ability in processing arbitrarily interleaved multimodal inputs makes it a powerful multimodal generalist system. GPT-4V's unique capability of understanding visual markers drawn on input images can give rise to new human-computer interaction methods.
arXiv Detail & Related papers (2023-09-29T17:34:51Z)
AutoML-GPT: Automatic Machine Learning with GPT [74.30699827690596]
We propose developing task-oriented prompts and automatically utilizing large language models (LLMs) to automate the training pipeline. We present AutoML-GPT, which employs GPT as the bridge to diverse AI models and dynamically trains models with optimized hyperparameters. This approach achieves remarkable results in computer vision, natural language processing, and other challenging areas.
arXiv Detail & Related papers (2023-05-04T02:09:43Z)
Can GPT-4 Perform Neural Architecture Search? [56.98363718371614]
We investigate the potential of GPT-4 to perform Neural Architecture Search (NAS). Our proposed approach is GPT-4 Enhanced Neural archItectUre Search (GENIUS). We assess GENIUS across several benchmarks, comparing it with existing state-of-the-art NAS techniques to illustrate its effectiveness.
arXiv Detail & Related papers (2023-04-21T14:06:44Z)
Performance of ChatGPT on the US Fundamentals of Engineering Exam: Comprehensive Assessment of Proficiency and Potential Implications for Professional Environmental Engineering Practice [0.0]
This study investigates the feasibility and effectiveness of using ChatGPT, a GPT-4 based model, in achieving satisfactory performance on the Fundamentals of Engineering (FE) Environmental Exam. The findings reflect remarkable improvements in mathematical capabilities across successive iterations of ChatGPT models, showcasing their potential in solving complex engineering problems.
arXiv Detail & Related papers (2023-04-20T16:54:34Z)
Sparks of Artificial General Intelligence: Early experiments with GPT-4 [66.1188263570629]
GPT-4, developed by OpenAI, was trained using an unprecedented scale of compute and data. We demonstrate that GPT-4 can solve novel and difficult tasks that span mathematics, coding, vision, medicine, law, psychology and more. We believe GPT-4 could reasonably be viewed as an early (yet still incomplete) version of an artificial general intelligence (AGI) system.
arXiv Detail & Related papers (2023-03-22T16:51:28Z)
GPT-4 Technical Report [116.90398195245983]
GPT-4 is a large-scale, multimodal model which can accept image and text inputs and produce text outputs. It exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers.
arXiv Detail & Related papers (2023-03-15T17:15:04Z)

This list is automatically generated from the titles and abstracts of the papers on this site.