Related papers: Fine-Tuning and Prompt Optimization: Two Great Steps that Work Better Together

URL: http://arxiv.org/abs/2407.10930v1
Date: Mon, 15 Jul 2024 17:30:31 GMT
Title: Fine-Tuning and Prompt Optimization: Two Great Steps that Work Better Together
Authors: Dilara Soylu, Christopher Potts, Omar Khattab
Abstract summary: We evaluate approximate optimization strategies in which we bootstrap training labels for all pipeline stages and use these to optimize the pipeline's prompts and fine-tune its weights alternatingly. Simple approaches for optimizing the prompts and weights together outperform directly optimizing weights alone and prompts alone by up to 65% and 5%, respectively, on average across LMs and tasks.
Score: 21.797319884895025
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Natural Language Processing (NLP) systems are increasingly taking the form of multi-stage pipelines involving multiple distinct language models (LMs) and prompting strategies. Here we address the question of how to fine-tune such systems to improve their performance. We cast this as a problem of optimizing the underlying LM weights and the prompting strategies together, and consider a challenging but highly realistic scenario in which we have no gold labels for any intermediate stages in the pipeline. To address this challenge, we evaluate approximate optimization strategies in which we bootstrap training labels for all pipeline stages and use these to optimize the pipeline's prompts and fine-tune its weights alternatingly. In experiments with multi-hop QA, mathematical reasoning, and feature-based classification, we find that simple approaches for optimizing the prompts and weights together outperform directly optimizing weights alone and prompts alone by up to 65% and 5%, respectively, on average across LMs and tasks. We will release our new optimizers in DSPy at http://dspy.ai
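
The alternating strategy described in the abstract can be illustrated with a toy sketch. This is not the DSPy optimizers from the paper: `ToyStage`, its "limit" rule, `MAX_DEMOS`, and every function name below are hypothetical stand-ins. Few-shot demos play the role of the prompt, an integer `tuned_reach` plays the role of the weights, and only the end-to-end label is used to decide which intermediate traces to keep.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

MAX_DEMOS = 2  # toy cap on few-shot demonstrations per prompt (assumption)

@dataclass
class ToyStage:
    """Hypothetical stand-in for one LM call in a pipeline."""
    fn: Callable[[int], int]  # behavior the stage should exhibit
    demos: List[Tuple[int, int]] = field(default_factory=list)  # the "prompt"
    tuned_reach: int = 0  # the "weights"

    def __call__(self, x: Optional[int]) -> Optional[int]:
        # Toy rule: demos and fine-tuning each extend how hard an input may be.
        limit = 1 + len(self.demos) + self.tuned_reach
        if x is None or abs(x) > limit:
            return None  # the stage fails on this input
        return self.fn(x)

def run(stages, x):
    """Run the pipeline; return the final output and per-stage (input, output) pairs."""
    trace, value = [], x
    for stage in stages:
        out = stage(value)
        trace.append((value, out))
        value = out
    return value, trace

def score(stages, trainset):
    """Fraction of examples whose final output matches the gold label."""
    return sum(run(stages, x)[0] == y for x, y in trainset) / len(trainset)

def bootstrap(stages, trainset):
    """Keep per-stage traces only from runs whose FINAL answer is correct."""
    data = [[] for _ in stages]
    for x, y in trainset:
        pred, trace = run(stages, x)
        if pred == y:  # no gold labels for intermediate stages are needed
            for i, pair in enumerate(trace):
                data[i].append(pair)
    return data

def optimize_prompts(stages, trainset):
    """Toy prompt optimization: use bootstrapped traces as few-shot demos."""
    for stage, pairs in zip(stages, bootstrap(stages, trainset)):
        stage.demos = pairs[:MAX_DEMOS]

def finetune_weights(stages, trainset):
    """Toy fine-tuning: more bootstrapped training pairs give more reach."""
    for stage, pairs in zip(stages, bootstrap(stages, trainset)):
        stage.tuned_reach = len(pairs)

def make_pipeline():
    # Two stages: negate, then square. Only the final label x * x is known.
    return [ToyStage(lambda x: -x), ToyStage(lambda x: x * x)]

trainset = [(x, x * x) for x in range(1, 9)]

prompts_only = make_pipeline()
optimize_prompts(prompts_only, trainset)

weights_only = make_pipeline()
finetune_weights(weights_only, trainset)

combined = make_pipeline()
for step in (optimize_prompts, finetune_weights, optimize_prompts):
    step(combined, trainset)

print(score(prompts_only, trainset), score(weights_only, trainset),
      score(combined, trainset))  # prints 0.25 0.25 0.625
```

The printed scores are artifacts of the toy limit rule and say nothing about real LMs. The sketch only shows the mechanics: each step re-bootstraps traces from the current program, so a better prompt yields more training pairs for the next fine-tuning step, and the reverse.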

Related papers

Multi-module GRPO: Composing Policy Gradients and Prompt Optimization for Language Model Programs [77.22973302887435]
Group Relative Policy Optimization (GRPO) has proven to be an effective tool for post-training language models (LMs). We present mmGRPO, a simple multi-module generalization of GRPO that groups LM calls by module across rollouts and handles variable-length and interrupted trajectories. We find that mmGRPO, composed with automatic prompt optimization, improves accuracy by 11% on average across classification, many-hop search, and privacy-preserving delegation tasks.
arXiv Detail & Related papers (2025-08-06T17:28:31Z)
Pangu Embedded: An Efficient Dual-system LLM Reasoner with Metacognition [95.54406667705999]
Pangu Embedded is an efficient Large Language Model (LLM) reasoner developed on Ascend Neural Processing Units (NPUs). It addresses the significant computational costs and inference latency challenges prevalent in existing reasoning-optimized LLMs. It delivers rapid responses and state-of-the-art reasoning quality within a single, unified model architecture.
arXiv Detail & Related papers (2025-05-28T14:03:02Z)
Optimizing Model Selection for Compound AI Systems [76.69936664916061]
We propose an efficient framework for model selection in compound systems. It iteratively selects one module and allocates to it the model with the highest module-wise performance. It confers 5%-70% accuracy gains compared to using the same LLM for all modules.
arXiv Detail & Related papers (2025-02-20T18:36:25Z)
Using Large Language Models for Parametric Shape Optimization [2.464331481632096]
We develop an optimization framework, LLM-PSO, to determine the optimal shape of parameterized engineering designs. Our preliminary exploration may inspire further investigations into harnessing LLMs for shape optimization and engineering design more broadly.
arXiv Detail & Related papers (2024-12-11T03:35:38Z)
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization [65.64108848398696]
We introduce a preference optimization process to enhance the multimodal reasoning capabilities of MLLMs. We develop a simple yet effective method, termed Mixed Preference Optimization (MPO), which boosts multimodal CoT performance. Our model, InternVL2-8B-MPO, achieves an accuracy of 67.0 on MathVista, outperforming InternVL2-8B by 8.7 points and achieving performance comparable to the 10x larger InternVL2-76B.
arXiv Detail & Related papers (2024-11-15T18:59:27Z)
LLM-based Optimization of Compound AI Systems: A Survey [64.39860384538338]
In a compound AI system, components such as an LLM call, a retriever, a code interpreter, or tools are interconnected. Recent advancements enable end-to-end optimization of these parameters using an LLM. This paper presents a survey of the principles and emerging trends in LLM-based optimization of compound AI systems.
arXiv Detail & Related papers (2024-10-21T18:06:25Z)
Optimizing Instructions and Demonstrations for Multi-Stage Language Model Programs [40.159064885288245]
We study prompt optimization for Language Model Programs. We factorize our problem into optimizing the free-form instructions and few-shot demonstrations of every module. We develop MIPRO, a novel algorithm for optimizing LM programs.
arXiv Detail & Related papers (2024-06-17T16:12:03Z)
Bypass Back-propagation: Optimization-based Structural Pruning for Large Language Models via Policy Gradient [57.9629676017527]
We propose an optimization-based structural pruning on Large-Language Models. We learn the pruning masks in a probabilistic space directly by optimizing the loss of the pruned model. Our method operates for 2.7 hours with around 35GB memory for the 13B models on a single A100 GPU.
arXiv Detail & Related papers (2024-06-15T09:31:03Z)
Two Optimizers Are Better Than One: LLM Catalyst Empowers Gradient-Based Optimization for Prompt Tuning [69.95292905263393]
We show that gradient-based optimization and large language models (LLMs) are complementary to each other, suggesting a collaborative optimization approach. Our code is released at https://www.guozix.com/guozix/LLM-catalyst.
arXiv Detail & Related papers (2024-05-30T06:24:14Z)
Unleashing the Potential of Large Language Models as Prompt Optimizers: An Analogical Analysis with Gradient-based Model Optimizers [108.72225067368592]
We propose a novel perspective to investigate the design of large language model (LLM)-based prompt optimizers. We identify two pivotal factors in model parameter learning: update direction and update method. In particular, we borrow the theoretical framework and learning methods from gradient-based optimization to design improved strategies.
arXiv Detail & Related papers (2024-02-27T15:05:32Z)
Large Language Models as Optimizers [106.52386531624532]
We propose Optimization by PROmpting (OPRO), a simple and effective approach to leverage large language models (LLMs) as optimizers. In each optimization step, the LLM generates new solutions from the prompt that contains previously generated solutions with their values. We demonstrate that the best prompts optimized by OPRO outperform human-designed prompts by up to 8% on GSM8K, and by up to 50% on Big-Bench Hard tasks.
arXiv Detail & Related papers (2023-09-07T00:07:15Z)
Multilevel leapfrogging initialization for quantum approximate optimization algorithm [3.126276325914251]
A Multilevel Leapfrogging Interpolation (MLI) strategy is proposed to reduce running costs for deep quantum algorithms. Results show that MLI can achieve the same quasi-optima as INTERP while consuming only 1/2 of the running costs required by INTERP. In addition to obtaining the same quasi-optima, greedy-MLI has better stability (i.e., a higher average approximation ratio) than INTERP and MLI.
arXiv Detail & Related papers (2023-06-12T09:32:57Z)

This list is automatically generated from the titles and abstracts of the papers on this site.