Feedback Descent: Open-Ended Text Optimization via Pairwise Comparison
- URL: http://arxiv.org/abs/2511.07919v1
- Date: Wed, 12 Nov 2025 01:28:35 GMT
- Title: Feedback Descent: Open-Ended Text Optimization via Pairwise Comparison
- Authors: Yoonho Lee, Joseph Boen, Chelsea Finn
- Abstract summary: Feedback Descent is a framework that optimizes text artifacts -- prompts, code, and molecules -- through structured textual feedback. We show that in-context learning can transform structured feedback into gradient-like directional information, enabling targeted edits. In the DOCKSTRING molecule discovery benchmark, Feedback Descent identifies novel drug-like molecules surpassing the $99.9$th percentile of a database with more than $260{,}000$ compounds across six protein targets.
- Score: 48.89195616081196
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce \textit{Feedback Descent}, a framework that optimizes text artifacts -- prompts, code, and molecules -- through structured textual feedback, rather than relying solely on scalar rewards. By preserving detailed critiques instead of compressing them to binary preferences, Feedback Descent widens the information bottleneck in preference learning, enabling directed optimization in text space rather than weight space. We show that in-context learning can transform structured feedback into gradient-like directional information, enabling targeted edits. Unlike prior approaches that collapse judgments into single bits, our evaluators pair each comparison with textual feedback, which functions as high-bandwidth supervision. The iteration loop is done purely at inference time, without modifying any model weights, and is task-agnostic. We evaluate Feedback Descent on three diverse domains and find that it outperforms state-of-the-art prompt optimization (GEPA), reinforcement learning methods (GRPO, REINVENT), and even specialized graph-based molecular optimizers. In the DOCKSTRING molecule discovery benchmark, Feedback Descent identifies novel drug-like molecules surpassing the $99.9$th percentile of a database with more than $260{,}000$ compounds across six protein targets.
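The abstract describes an inference-time loop: an evaluator compares two candidate texts, returns the winner together with a textual critique, and that critique steers the next edit. A minimal runnable sketch of that loop, with hypothetical `judge` and `propose_edit` stubs standing in for the paper's LLM-based evaluator and editor:

```python
def judge(a: str, b: str) -> tuple[str, str]:
    """Hypothetical pairwise evaluator. Returns the winner plus a textual
    critique (the high-bandwidth feedback Feedback Descent preserves,
    rather than compressing the comparison to a single bit). Stubbed with
    a length heuristic so the sketch is self-contained and runnable."""
    winner = a if len(a) >= len(b) else b
    critique = "The winner covers more detail; expand the weaker passages."
    return winner, critique


def propose_edit(text: str, critique: str) -> str:
    """Hypothetical in-context editor. In the real system an LLM rewrites
    `text` guided by `critique`; stubbed with a trivial append here."""
    return text + " [elaborated]"


def feedback_descent(seed: str, steps: int = 3) -> str:
    """Inference-time optimization loop: propose an edit guided by the
    latest critique, judge it against the current best, keep the winner,
    and carry the new critique forward. No model weights are updated."""
    best, critique = seed, "Initial draft; no feedback yet."
    for _ in range(steps):
        candidate = propose_edit(best, critique)
        best, critique = judge(best, candidate)
    return best
```

With these stubs the candidate always wins, so each step appends one revision marker; the real system replaces both stubs with LLM calls while keeping the same loop structure.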
Related papers
- TextBFGS: Quasi-Newton Optimization for Discrete Executable Text via Gradient-Operator Retrieval [38.3962427355446]
We introduce TextBFGS, a second-order framework that implements a Quasi-Newton optimization method for discrete text. TextBFGS approximates the inverse Hessian matrix by retrieving gradient operators from a memory of pre-learned successful trajectories. It achieves superior pass rates with fewer model calls and exhibits strong cross-task transferability.
arXiv Detail & Related papers (2026-01-20T05:45:56Z)
- LRANet++: Low-Rank Approximation Network for Accurate and Efficient Text Spotting [118.93173826110815]
We propose a novel parameterized text-shape method based on low-rank approximation for precise detection. By exploiting the inherent shape correlation among different text contours, our method achieves consistency and compactness in shape representation. We integrate the enhanced detection module with a lightweight recognition branch to form an end-to-end text spotting framework, termed LRANet++.
arXiv Detail & Related papers (2025-11-08T03:08:03Z)
- Repeating Words for Video-Language Retrieval with Coarse-to-Fine Objectives [93.31112073070906]
Existing methods rely on large-scale pre-training to improve video retrieval performance. We propose a novel framework to learn fine-grained features for better alignment. We also introduce an inference pipeline that improves performance without additional training.
arXiv Detail & Related papers (2025-08-20T16:03:56Z)
- Towards Bridging Review Sparsity in Recommendation with Textual Edge Graph Representation [28.893058826607735]
We propose a unified framework that imputes missing reviews by jointly modeling semantic and structural signals. Experiments on the Amazon and Goodreads datasets show that TWISTER consistently outperforms traditional numeric, graph-based, and LLM baselines. In summary, TWISTER generates reviews that are more helpful, authentic, and specific, while smoothing structural signals for improved recommendations.
arXiv Detail & Related papers (2025-08-02T00:53:40Z)
- Text2Grad: Reinforcement Learning from Natural Language Feedback [32.59003667154527]
We introduce Text2Grad, a fine-grained reinforcement learning paradigm that turns free-form textual feedback into span-level gradients. Our results demonstrate that natural-language feedback, when converted to gradients, is a powerful signal for fine-grained policy optimization.
arXiv Detail & Related papers (2025-05-28T13:23:49Z)
- Fast Prompt Alignment for Text-to-Image Generation [28.66112701912297]
This paper introduces Fast Prompt Alignment (FPA), a prompt optimization framework that leverages a one-pass approach. FPA uses large language models (LLMs) for single-iteration prompt paraphrasing, followed by fine-tuning or in-context learning with optimized prompts. FPA achieves competitive text-image alignment scores at a fraction of the processing time.
arXiv Detail & Related papers (2024-12-11T18:58:41Z)
- In-context Demonstration Matters: On Prompt Optimization for Pseudo-Supervision Refinement [71.60563181678323]
Large language models (LLMs) have achieved great success across diverse tasks, and fine-tuning is sometimes needed to further enhance generation quality. To handle these challenges, a direct solution is to generate "high-confidence" data from unsupervised downstream tasks. We propose a novel approach, the pseudo-supervised demonstrations aligned prompt optimization (PAPO) algorithm, which jointly refines both the prompt and the overall pseudo-supervision.
arXiv Detail & Related papers (2024-10-04T03:39:28Z)
- TextGrad: Automatic "Differentiation" via Text [32.94896315864364]
TextGrad backpropagates textual feedback to improve individual components of a compound AI system.
It works out-of-the-box for a variety of tasks, where the users only provide the objective function without tuning components or prompts of the framework.
We showcase TextGrad's effectiveness and generality across a diverse range of applications, from question answering and molecule optimization to radiotherapy treatment planning.
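TextGrad's core idea, per the summary above, is to propagate textual feedback backward through the components of a compound AI system, analogously to gradients. A minimal runnable sketch under that analogy, with a hypothetical `criticize` stub in place of the LLM that converts downstream feedback into a component-specific critique:

```python
def criticize(component_text: str, downstream_feedback: str) -> str:
    """Hypothetical 'textual gradient': in TextGrad an LLM turns feedback
    from downstream components into a critique of this component. Stubbed
    with string formatting so the sketch runs without an LLM."""
    return f"revise '{component_text}' because: {downstream_feedback}"


def textual_backprop(components: list[tuple[str, str]],
                     output_feedback: str) -> dict[str, str]:
    """Walk the component chain in reverse, handing each component the
    critique produced for the component after it, mirroring how scalar
    gradients flow backward through a computation graph."""
    updates = {}
    feedback = output_feedback
    for name, text in reversed(components):
        updates[name] = criticize(text, feedback)
        feedback = updates[name]
    return updates
```

The component names and chain here are illustrative; the real system would use LLM calls for `criticize` and apply each critique to rewrite the corresponding prompt or code fragment.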
arXiv Detail & Related papers (2024-06-11T17:32:21Z)
- TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision [61.186488081379]
We propose TextFormer, a query-based end-to-end text spotter with Transformer architecture.
TextFormer builds upon an image encoder and a text decoder to learn a joint semantic understanding for multi-task modeling.
It allows for mutual training and optimization of classification, segmentation, and recognition branches, resulting in deeper feature sharing.
arXiv Detail & Related papers (2023-06-06T03:37:41Z)
- Three ways to improve feature alignment for open vocabulary detection [88.65076922242184]
A key problem in zero-shot open-vocabulary detection is how to align visual and text features so that the detector performs well on unseen classes.
Previous approaches train the feature pyramid and detection head from scratch, which breaks the vision-text feature alignment established during pretraining.
We propose three methods to alleviate these issues. Firstly, a simple scheme is used to augment the text embeddings which prevents overfitting to a small number of classes seen during training.
Secondly, the feature pyramid network and the detection head are modified to include trainable shortcuts.
Finally, a self-training approach is used to leverage a larger corpus of …
arXiv Detail & Related papers (2023-03-23T17:59:53Z)
- Momentum Decoding: Open-ended Text Generation As Graph Exploration [49.812280360794894]
Open-ended text generation with autoregressive language models (LMs) is one of the core tasks in natural language processing.
We formulate open-ended text generation from a new perspective, i.e., we view it as an exploration process within a directed graph.
We propose a novel decoding method -- \textit{momentum decoding} -- which encourages the LM to explore new nodes outside the current graph.
arXiv Detail & Related papers (2022-12-05T11:16:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.