Related papers: Modular Prompt Optimization: Optimizing Structured Prompts with Section-Local Textual Gradients

URL: http://arxiv.org/abs/2601.04055v1
Date: Wed, 07 Jan 2026 16:20:08 GMT
Title: Modular Prompt Optimization: Optimizing Structured Prompts with Section-Local Textual Gradients
Authors: Prith Sharma, Austin Z. Henley
Abstract summary: We introduce a schema-based prompt optimization framework that treats prompts as structured objects composed of fixed semantic sections. We evaluate MPO on two reasoning benchmarks, ARC-Challenge and MMLU, using LLaMA-3 8B-Instruct and Mistral-7B-Instruct as solver models.
Score: 0.8604557306886812
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Prompt quality plays a central role in controlling the behavior, reliability, and reasoning performance of large language models (LLMs), particularly for smaller open-source instruction-tuned models that depend heavily on explicit structure. While recent work has explored automatic prompt optimization using textual gradients and self-refinement, most existing methods treat prompts as monolithic blocks of text, making it difficult to localize errors, preserve critical instructions, or prevent uncontrolled prompt growth. We introduce Modular Prompt Optimization (MPO), a schema-based prompt optimization framework that treats prompts as structured objects composed of fixed semantic sections, including system role, context, task description, constraints, and output format. MPO applies section-local textual gradients, generated by a critic language model, to refine each section independently while keeping the overall prompt schema fixed. Section updates are consolidated through de-duplication to reduce redundancy and interference between components, yielding an interpretable and robust optimization process. We evaluate MPO on two reasoning benchmarks, ARC-Challenge and MMLU, using LLaMA-3 8B-Instruct and Mistral-7B-Instruct as solver models. Across both benchmarks and models, MPO consistently outperforms an untuned structured prompt and the TextGrad baseline, achieving substantial accuracy gains without modifying model parameters or altering prompt structure. These results demonstrate that maintaining a fixed prompt schema while applying localized, section-wise optimization is an effective and practical approach for improving reasoning performance in small open-source LMs.
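
The loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the section names come from the abstract, but the critic interface, the `optimize_step` and `render` helpers, the toy critic, and the line-level, case-insensitive de-duplication rule are all assumptions made for illustration. In practice the critic would be a language model call rather than a hard-coded function.

```python
from typing import Callable, Dict, List

# Fixed schema: the section names listed in the abstract.
SECTIONS = ("system_role", "context", "task_description",
            "constraints", "output_format")

# Hypothetical critic interface (an assumption, not from the paper):
# (section name, current section text, failure examples) -> suggested lines.
Critic = Callable[[str, str, List[str]], List[str]]


def optimize_step(prompt: Dict[str, str], critic: Critic,
                  failures: List[str]) -> Dict[str, str]:
    """Apply one round of section-local updates while keeping the schema fixed."""
    if set(prompt) != set(SECTIONS):
        raise ValueError("prompt must contain exactly the schema sections")
    seen = set()   # normalized lines already kept anywhere in the prompt
    updated = {}
    for name in SECTIONS:
        # Each section is critiqued on its own; the critic never edits
        # another section and cannot add or remove sections.
        candidate = prompt[name].splitlines() + critic(name, prompt[name], failures)
        kept = []
        for line in candidate:
            key = " ".join(line.lower().split())  # ignore case and spacing
            # Assumed consolidation rule: drop empty lines and any line
            # already present in this or an earlier section.
            if key and key not in seen:
                seen.add(key)
                kept.append(line.strip())
        updated[name] = "\n".join(kept)
    return updated


def render(prompt: Dict[str, str]) -> str:
    """Serialize the structured prompt in schema order."""
    return "\n\n".join(f"[{name}]\n{prompt[name]}" for name in SECTIONS)


def toy_critic(section: str, text: str, failures: List[str]) -> List[str]:
    """Stand-in for a critic LM: returns fixed suggestions when failures exist."""
    if not failures:
        return []
    if section == "constraints":
        return ["Answer with a single letter.", "answer with a single letter."]
    if section == "output_format":
        return ["Answer with a single letter."]  # redundant across sections
    return []
```

Calling `optimize_step(prompt, toy_critic, failures)` on a five-section prompt returns a new prompt with the same five keys. The repeated suggestion is kept once, in the constraints section, and dropped from the output format section, so the prompt grows only by non-redundant lines.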

Related papers

Prompt Optimization Via Diffusion Language Models [73.9599434962714]
We propose a diffusion-based framework for prompt optimization. Our method enables flexible, span-level prompt updates without requiring access to or modifying the downstream language model. We show that moderate diffusion step counts provide the best balance between refinement quality and stability.
arXiv Detail & Related papers (2026-01-30T00:00:54Z)
Learning from Prompt itself: the Hierarchical Attribution Prompt Optimization [13.8868879878572]
A structured optimization approach requires automated or semi-automated procedures to develop improved prompts. Current prompt optimization methods often induce prompt drift, where new prompts fix prior failures but impair performance on previously successful tasks. This study proposes the Hierarchical Prompt Optimization framework, which introduces three innovations: (1) a dynamic attribution mechanism targeting error patterns in training data and prompting history, (2) semantic-unit optimization for editing functional prompt segments, and (3) multimodal-friendly progression supporting both end-to-end LLM and LLM-MLLM.
arXiv Detail & Related papers (2026-01-06T03:34:17Z)
Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models [18.829572148850563]
We introduce ACE (Agentic Context Engineering), a framework that treats contexts as evolving playbooks. Across agent and domain-specific benchmarks, ACE consistently outperforms strong baselines. ACE could adapt effectively without labeled supervision, instead leveraging natural execution feedback.
arXiv Detail & Related papers (2025-10-06T09:30:18Z)
Improving Large Language Models Function Calling and Interpretability via Guided-Structured Templates [56.73907811047611]
Large language models (LLMs) have demonstrated strong reasoning and tool-use capabilities. LLMs often fail in real-world tool interactions due to incorrect parameterization, poor tool selection, or misinterpretation of user intent. We introduce a curriculum-inspired framework that leverages structured reasoning templates to guide LLMs through more deliberate step-by-step instructions for generating function calls.
arXiv Detail & Related papers (2025-09-22T17:55:14Z)
Reflection-Enhanced Meta-Optimization Integrating TextGrad-style Prompt Optimization with Memory-Driven Self-Evolution [0.0]
We propose a framework that integrates a memory-augmented Reflection Retrieval-Augmented Generation (RAG) module and a Self-Adaptive meta-controller. REMO achieves more stable and robust tuning, albeit at the cost of increased computational overhead.
arXiv Detail & Related papers (2025-08-26T07:25:45Z)
Promptomatix: An Automatic Prompt Optimization Framework for Large Language Models [72.4723784999432]
Large Language Models (LLMs) perform best with well-crafted prompts, yet prompt engineering remains manual, inconsistent, and inaccessible to non-experts. Promptomatix transforms natural language task descriptions into high-quality prompts without requiring manual tuning or domain expertise. The system analyzes user intent, generates synthetic training data, selects prompting strategies, and refines prompts using cost-aware objectives.
arXiv Detail & Related papers (2025-07-17T18:18:20Z)
REASONING COMPILER: LLM-Guided Optimizations for Efficient Model Serving [6.19179006129561]
We introduce a novel compilation framework (dubbed Reasoning) that formulates optimization as a sequential, context-aware decision process. Our approach demonstrates the potential of LLM-guided reasoning to transform the landscape of compiler optimization.
arXiv Detail & Related papers (2025-06-02T07:02:46Z)
Leveraging Importance Sampling to Detach Alignment Modules from Large Language Models [48.15777554876988]
Traditional alignment methods often require retraining large pretrained models. We propose a novel Residual Alignment Model (RAM) that formalizes the alignment process as a type of importance sampling. We develop a resampling algorithm with iterative token-level decoding to address the common first-token latency issue in comparable methods.
arXiv Detail & Related papers (2025-05-26T08:53:02Z)
GReaTer: Gradients over Reasoning Makes Smaller Language Models Strong Prompt Optimizers [52.17222304851524]
We introduce GReaTer, a novel prompt optimization technique that directly incorporates gradient information over task-specific reasoning. By utilizing task loss gradients, GReaTer enables self-optimization of prompts for open-source, lightweight language models. GReaTer consistently outperforms previous state-of-the-art prompt optimization methods.
arXiv Detail & Related papers (2024-12-12T20:59:43Z)
SCULPT: Systematic Tuning of Long Prompts [17.00433893207345]
We propose a framework that treats prompt optimization as a hierarchical tree refinement problem. SCULPT represents prompts as tree structures, enabling targeted modifications while preserving contextual integrity. It produces more stable and interpretable prompt modifications, ensuring better generalization across tasks.
arXiv Detail & Related papers (2024-10-28T07:10:10Z)
In-context Demonstration Matters: On Prompt Optimization for Pseudo-Supervision Refinement [71.60563181678323]
Large language models (LLMs) have achieved great success across diverse tasks, and fine-tuning is sometimes needed to further enhance generation quality. To handle these challenges, a direct solution is to generate "high-confidence" data from unsupervised downstream tasks. We propose a novel approach, pseudo-supervised demonstrations aligned prompt optimization (PAPO) algorithm, which jointly refines both the prompt and the overall pseudo-supervision.
arXiv Detail & Related papers (2024-10-04T03:39:28Z)
Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities. In-Context Learning (ICL) and Parameter-Efficient Fine-Tuning (PEFT) are currently two mainstream methods for augmenting LLMs to downstream tasks. We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z)

This list is automatically generated from the titles and abstracts of the papers on this site.