Large language models as uncertainty-calibrated optimizers for experimental discovery
- URL: http://arxiv.org/abs/2504.06265v3
- Date: Fri, 07 Nov 2025 17:11:12 GMT
- Title: Large language models as uncertainty-calibrated optimizers for experimental discovery
- Authors: Bojana Ranković, Ryan-Rhys Griffiths, Philippe Schwaller
- Abstract summary: We show that training language models through the uncertainty-aware objectives of traditional optimization methods enables their use as reliable optimizers guided by natural language. Our method nearly doubles the discovery rate of high-yielding reaction conditions, from 24% to 43% in 50 experimental iterations starting from 10 unsuccessful conditions.
- Score: 4.968931211284832
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Scientific discovery increasingly depends on efficient experimental optimization to navigate vast design spaces under time and resource constraints. Traditional approaches often require extensive domain expertise and feature engineering. While large language models, with their vast scientific knowledge, circumvent the feature engineering limitations, they lack the calibrated uncertainty estimates required for high-stakes decision making. Hence, current optimization methods force a choice between domain knowledge and reliability, with no principled approach that affords both. In this work, we show that training language models through the uncertainty-aware objectives of traditional optimization methods enables their use as reliable optimizers guided by natural language. By teaching LLMs from experimental outcomes under uncertainty, we transform their overconfidence from a fundamental limitation into a precise calibration mechanism. Applied to Buchwald-Hartwig reactions, a cornerstone of pharmaceutical synthesis, our method nearly doubles the discovery rate of high-yielding reaction conditions, from 24% to 43% in 50 experimental iterations starting from 10 unsuccessful conditions. Across 19 diverse optimization problems spanning organic synthesis, materials science and catalysis, process chemistry, and molecular design, our approach ranks first on average, establishing a new paradigm for reliable, uncertainty-guided optimization with LLMs. Our approach can accelerate discovery by lowering the barrier to using powerful optimization methods, replacing the need for domain-specific feature engineering with more accessible natural language interfaces. These findings highlight that ensuring reliability through principled uncertainty quantification is critical for realizing the full potential of AI-guided experimentation.
Related papers
- Physics-Informed Uncertainty Enables Reliable AI-driven Design [1.1104649308580707]
Inverse design is a central goal in science and engineering, including frequency-selective surfaces that are critical to microelectronics for telecommunications and optical metamaterials. Traditional surrogate-assisted optimization methods using deep learning can accelerate the design process but do not usually incorporate uncertainty quantification. Here, we introduce and validate a fundamentally different paradigm of Physics-Informed Uncertainty, where the degree to which a model's prediction violates fundamental physical laws serves as a computationally cheap and effective proxy for predictive uncertainty.
arXiv Detail & Related papers (2026-01-26T16:10:59Z) - Can Large Language Models Express Uncertainty Like Human? [71.27418419522884]
We release the first diverse, large-scale dataset of hedging expressions with human-annotated confidence scores. We conduct the first systematic study of linguistic confidence across modern large language models.
arXiv Detail & Related papers (2025-09-29T02:34:30Z) - Cognitive Loop via In-Situ Optimization: Self-Adaptive Reasoning for Science [1.309289689673624]
We introduce an alternative approach that enables deep and precise control over the reasoning process, called a cognitive loop via in-situ optimization (CLIO). CLIO enables large language models to self-formulate ways of approaching a problem, adapt behavior when self-confidence is low, and ultimately provide scientists with a final belief or answer. Without any further post-training, OpenAI's GPT-4.1 with CLIO yields an accuracy of 22.37% in text-based biology and medicine questions on Humanity's Last Exam (HLE).
arXiv Detail & Related papers (2025-08-04T18:01:35Z) - Optimizers Qualitatively Alter Solutions And We Should Leverage This [62.662640460717476]
Deep Neural Networks (DNNs) cannot guarantee convergence to a unique global minimum of the loss when trained with optimizers that use only local information, such as SGD. We argue that the community should aim at understanding the biases of already existing methods, as well as aim to build new optimizers with the explicit intent of inducing certain properties of the solution.
arXiv Detail & Related papers (2025-07-16T13:33:31Z) - New Insights on Unfolding and Fine-tuning Quantum Federated Learning [12.248184406275405]
This study addresses the core limitations of Quantum Federated Learning (QFL), streamlining its applicability to complex challenges such as healthcare and genomic research. By developing self-adaptive fine-tuning, the proposed method proves particularly effective in critical applications such as gene expression analysis and cancer detection, enhancing diagnostic precision and predictive modeling within quantum systems.
arXiv Detail & Related papers (2025-06-24T21:17:48Z) - Divide, Optimize, Merge: Fine-Grained LLM Agent Optimization at Scale [19.60416591361918]
Fine-Grained Optimization (FGO) is a scalable framework that divides large optimization tasks into manageable subsets, performs targeted optimizations, and systematically combines optimized components through progressive merging. Evaluation across the ALFWorld, LogisticsQA, and GAIA benchmarks demonstrates that FGO outperforms existing approaches by 1.6-8.6% while reducing average prompt token consumption by 56.3%.
arXiv Detail & Related papers (2025-05-06T20:50:27Z) - Optuna vs Code Llama: Are LLMs a New Paradigm for Hyperparameter Tuning? [42.362388367152256]
Large language models (LLMs) are used to fine-tune a parameter-efficient version of Code Llama using LoRA.
Our method achieves competitive or superior results in terms of Root Mean Square Error (RMSE) while significantly reducing computational overhead.
arXiv Detail & Related papers (2025-04-08T13:15:47Z) - Make Optimization Once and for All with Fine-grained Guidance [78.14885351827232]
Learning to Optimize (L2O) enhances optimization efficiency with integrated neural networks.
L2O paradigms achieve great outcomes, e.g., refitting, generating unseen solutions iteratively or directly.
Our analyses explore a general framework for learning optimization, called Diff-L2O, focusing on augmenting solutions from a wider view.
arXiv Detail & Related papers (2025-03-14T14:48:12Z) - Language-Based Bayesian Optimization Research Assistant (BORA) [0.0]
Contextualizing domain knowledge is a powerful approach to guide searches for fruitful regions. Here, we propose the use of Large Language Models (LLMs) for contextualizing search optimization.
arXiv Detail & Related papers (2025-01-27T17:20:04Z) - Align-Pro: A Principled Approach to Prompt Optimization for LLM Alignment [40.71270945505082]
Large language models (LLMs) are increasingly integrated into various societal and decision-making processes. Traditional methods, such as reinforcement learning from human feedback (RLHF), achieve alignment by fine-tuning model parameters. In contrast, prompt optimization is a viable alternative to RLHF for LLM alignment.
arXiv Detail & Related papers (2025-01-07T03:14:39Z) - Enhancing Trust in Large Language Models with Uncertainty-Aware Fine-Tuning [10.457661605916435]
Large language models (LLMs) have revolutionized the field of natural language processing with their impressive reasoning and question-answering capabilities. LLMs are sometimes prone to generating credible-sounding but incorrect information, a phenomenon known as hallucinations. We introduce a novel uncertainty-aware causal language modeling loss function, grounded in the principles of decision theory.
arXiv Detail & Related papers (2024-12-03T23:14:47Z) - Can Learned Optimization Make Reinforcement Learning Less Difficult? [70.5036361852812]
We consider whether learned optimization can help overcome reinforcement learning difficulties. Our method, Learned Optimization for Plasticity, Exploration and Non-stationarity (OPEN), meta-learns an update rule whose input features and output structure are informed by previously proposed solutions to these difficulties.
arXiv Detail & Related papers (2024-07-09T17:55:23Z) - Bypass Back-propagation: Optimization-based Structural Pruning for Large Language Models via Policy Gradient [57.9629676017527]
We propose an optimization-based structural pruning on Large-Language Models.
We learn the pruning masks in a probabilistic space directly by optimizing the loss of the pruned model.
Our method operates for 2.7 hours with around 35GB memory for the 13B models on a single A100 GPU.
arXiv Detail & Related papers (2024-06-15T09:31:03Z) - LLM as a Complementary Optimizer to Gradient Descent: A Case Study in Prompt Tuning [69.95292905263393]
We show that gradient-based optimization and LLM-based optimization are complementary to each other and can effectively collaborate in a combined optimization framework.
arXiv Detail & Related papers (2024-05-30T06:24:14Z) - Localized Zeroth-Order Prompt Optimization [54.964765668688806]
We propose a novel algorithm, namely localized zeroth-order prompt optimization (ZOPO).
ZOPO incorporates a Gaussian process derived from the Neural Tangent Kernel into standard zeroth-order optimization for an efficient search of well-performing local optima in prompt optimization.
Remarkably, ZOPO outperforms existing baselines in terms of both the optimization performance and the query efficiency.
arXiv Detail & Related papers (2024-03-05T14:18:15Z) - Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark [166.40879020706151]
This paper proposes a shift towards BP-free, zeroth-order (ZO) optimization as a solution for reducing memory costs during fine-tuning.
Unlike traditional ZO-SGD methods, our work expands the exploration to a wider array of ZO optimization techniques.
Our study unveils previously overlooked optimization principles, highlighting the importance of task alignment, the role of the forward gradient method, and the balance between algorithm complexity and fine-tuning performance.
arXiv Detail & Related papers (2024-02-18T14:08:48Z) - Uncertainty Quantification for Forward and Inverse Problems of PDEs via
Latent Global Evolution [110.99891169486366]
We propose a method that integrates efficient and precise uncertainty quantification into a deep learning-based surrogate model.
Our method endows deep learning-based surrogate models with robust and efficient uncertainty quantification capabilities for both forward and inverse problems.
Our method excels at propagating uncertainty over extended auto-regressive rollouts, making it suitable for scenarios involving long-term predictions.
arXiv Detail & Related papers (2024-02-13T11:22:59Z) - End-to-End Learning for Fair Multiobjective Optimization Under
Uncertainty [55.04219793298687]
The Predict-Then-Optimize (PtO) paradigm in machine learning aims to maximize downstream decision quality.
This paper extends the PtO methodology to optimization problems with nondifferentiable Ordered Weighted Averaging (OWA) objectives.
It shows how optimization of OWA functions can be effectively integrated with parametric prediction for fair and robust optimization under uncertainty.
arXiv Detail & Related papers (2024-02-12T16:33:35Z) - Learning to Generalize Provably in Learning to Optimize [185.71326306329678]
Learning to optimize (L2O), which automates the design of optimizers through data-driven approaches, has gained increasing popularity.
Current L2O methods often suffer from poor generalization performance in at least two folds.
We propose to incorporate these two metrics as flatness-aware regularizers into the L2O framework.
arXiv Detail & Related papers (2023-02-22T01:17:31Z) - Incorporating Expert Prior Knowledge into Experimental Design via Posterior Sampling [58.56638141701966]
Experimenters can often acquire knowledge about the location of the global optimum.
It is unknown how to incorporate the expert prior knowledge about the global optimum into Bayesian optimization.
An efficient Bayesian optimization approach has been proposed via posterior sampling on the posterior distribution of the global optimum.
arXiv Detail & Related papers (2020-02-26T01:57:36Z)