Related papers: Optimizing Prompts for Large Language Models: A Causal Approach

Optimizing Prompts for Large Language Models: A Causal Approach

URL: http://arxiv.org/abs/2602.01711v1
Date: Mon, 02 Feb 2026 06:37:11 GMT
Title: Optimizing Prompts for Large Language Models: A Causal Approach
Authors: Wei Chen, Yanbin Fang, Shuran Fu, Fasheng Xu, Xuan Wei,
Abstract summary: We propose a framework that reframes prompt design as a problem of causal estimation.<n>CPO consistently outperforms human-engineered prompts and state-of-the-art automateds.
Score: 8.091354120300972
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Language Models (LLMs) are increasingly embedded in enterprise workflows, yet their performance remains highly sensitive to prompt design. Automatic Prompt Optimization (APO) seeks to mitigate this instability, but existing approaches face two persistent challenges. First, commonly used prompt strategies rely on static instructions that perform well on average but fail to adapt to heterogeneous queries. Second, more dynamic approaches depend on offline reward models that are fundamentally correlational, confounding prompt effectiveness with query characteristics. We propose Causal Prompt Optimization (CPO), a framework that reframes prompt design as a problem of causal estimation. CPO operates in two stages. First, it learns an offline causal reward model by applying Double Machine Learning (DML) to semantic embeddings of prompts and queries, isolating the causal effect of prompt variations from confounding query attributes. Second, it utilizes this unbiased reward signal to guide a resource-efficient search for query-specific prompts without relying on costly online evaluation. We evaluate CPO across benchmarks in mathematical reasoning, visualization, and data analytics. CPO consistently outperforms human-engineered prompts and state-of-the-art automated optimizers. The gains are driven primarily by improved robustness on hard queries, where existing methods tend to deteriorate. Beyond performance, CPO fundamentally reshapes the economics of prompt optimization: by shifting evaluation from real-time model execution to an offline causal model, it enables high-precision, per-query customization at a fraction of the inference cost required by online methods. Together, these results establish causal inference as a scalable foundation for reliable and cost-efficient prompt optimization in enterprise LLM deployments.

Related papers

Learning from Prompt itself: the Hierarchical Attribution Prompt Optimization [13.8868879878572]
A structured optimization approach requires automated or semi-automated procedures to develop improved prompts.<n>Current prompt optimization methods often induce prompt drift, where new prompts fix prior failures but impair performance on previously successful tasks.<n>This study proposes the Hierarchical Prompt Optimization framework, which introduces three innovations: (1) a dynamic attribution mechanism targeting error patterns in training data and prompting history, (2) semantic-unit optimization for editing functional prompt segments, and (3) multimodal-friendly progression supporting both end-to-end LLM and LLM-MLLM.
arXiv Detail & Related papers (2026-01-06T03:34:17Z)
Better by Comparison: Retrieval-Augmented Contrastive Reasoning for Automatic Prompt Optimization [6.3914079241545885]
We present Contrastive Reasoning Prompt Optimization (CRPO), a novel framework that formulates prompt optimization as a retrieval-augmented reasoning process.<n>Our approach retrieves top k reference prompt-response pairs from the HelpSteer2 dataset.<n>By explicitly contrasting high and low quality exemplars, CRPO enables the model to deduce why certain prompts succeed while others fail.
arXiv Detail & Related papers (2025-09-02T08:45:29Z)
GreenTEA: Gradient Descent with Topic-modeling and Evolutionary Auto-prompting [2.085792950847639]
GreenTEA is an agentic workflow for automatic prompt optimization.<n>It balances candidate exploration and knowledge exploitation.<n>It iteratively refines prompts based on feedback from error samples.
arXiv Detail & Related papers (2025-08-12T06:48:30Z)
Semantic Caching for Low-Cost LLM Serving: From Offline Learning to Online Adaptation [54.61034867177997]
Caching inference responses allows them to be retrieved without another forward pass through the Large Language Models.<n>Traditional exact-match caching overlooks the semantic similarity between queries, leading to unnecessary recomputation.<n>We present a principled, learning-based framework for semantic cache eviction under unknown query and cost distributions.
arXiv Detail & Related papers (2025-08-11T06:53:27Z)
Can Prompt Difficulty be Online Predicted for Accelerating RL Finetuning of Reasoning Models? [65.18157595903124]
This work investigates iterative approximate evaluation for arbitrary prompts.<n>It introduces Model Predictive Prompt Selection (MoPPS), a Bayesian risk-predictive framework.<n>MoPPS reliably predicts prompt difficulty and accelerates training with significantly reduced rollouts.
arXiv Detail & Related papers (2025-07-07T03:20:52Z)
Accelerating RL for LLM Reasoning with Optimal Advantage Regression [52.0792918455501]
We propose a novel two-stage policy optimization framework that directly approximates the optimal advantage function.<n>$A$*-PO achieves competitive performance across a wide range of mathematical reasoning benchmarks.<n>It reduces training time by up to 2$times$ and peak memory usage by over 30% compared to PPO, GRPO, and REBEL.
arXiv Detail & Related papers (2025-05-27T03:58:50Z)
Semantic-Preserving Adversarial Attacks on LLMs: An Adaptive Greedy Binary Search Approach [15.658579092368981]
Large Language Models (LLMs) increasingly rely on automatic prompt engineering in graphical user interfaces (GUIs) to refine user inputs and enhance response accuracy.<n>We propose the Adaptive Greedy Binary Search (AGBS) method, which simulates common prompt optimization mechanisms while preserving semantic stability.
arXiv Detail & Related papers (2025-05-26T15:41:06Z)
Autoformulation of Mathematical Optimization Models Using LLMs [50.030647274271516]
This paper approaches the problem of $textitautoformulation$: the automated creation of solver-ready optimization models from natural language problem descriptions.<n>We identify three core challenges of autoformulation: $textit(1)$ the vast, problem-dependent hypothesis space, and $textit(2)$ efficient and diverse exploration of this space under uncertainty.<n>We present a novel method leveraging $textitLarge Language Models$ with $textitMonte-Carlo Tree Search$, exploiting the hierarchical nature of optimization modeling to generate and systematically explore possible formulations
arXiv Detail & Related papers (2024-11-03T20:41:38Z)
QPO: Query-dependent Prompt Optimization via Multi-Loop Offline Reinforcement Learning [58.767866109043055]
We introduce Query-dependent Prompt Optimization (QPO), which iteratively fine-tune a small pretrained language model to generate optimal prompts tailored to the input queries.<n>We derive insights from offline prompting demonstration data, which already exists in large quantities as a by-product of benchmarking diverse prompts on open-sourced tasks.<n> Experiments on various LLM scales and diverse NLP and math tasks demonstrate the efficacy and cost-efficiency of our method in both zero-shot and few-shot scenarios.
arXiv Detail & Related papers (2024-08-20T03:06:48Z)
Query-Dependent Prompt Evaluation and Optimization with Offline Inverse RL [62.824464372594576]
We aim to enhance arithmetic reasoning ability of Large Language Models (LLMs) through zero-shot prompt optimization. We identify a previously overlooked objective of query dependency in such optimization. We introduce Prompt-OIRL, which harnesses offline inverse reinforcement learning to draw insights from offline prompting demonstration data.
arXiv Detail & Related papers (2023-09-13T01:12:52Z)

This list is automatically generated from the titles and abstracts of the papers in this site.