Related papers: Combining Large Language Models and Gradient-Free Optimization for Automatic Control Policy Synthesis

URL: http://arxiv.org/abs/2510.00373v1
Date: Wed, 01 Oct 2025 00:42:15 GMT
Title: Combining Large Language Models and Gradient-Free Optimization for Automatic Control Policy Synthesis
Authors: Carlo Bosio, Matteo Guarrera, Alberto Sangiovanni-Vincentelli, Mark W. Mueller
Abstract summary: Large language models (LLMs) have shown promise as generators of symbolic control policies. We propose a hybrid approach that decouples structural synthesis from parameter optimization. We show that combining symbolic program synthesis with numerical optimization yields interpretable yet high-performing policies.
Score: 2.8593976574111264
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) have shown promise as generators of symbolic control policies, producing interpretable program-like representations through iterative search. However, these models are not capable of separating the functional structure of a policy from the numerical values it is parametrized by, thus making the search process slow and inefficient. We propose a hybrid approach that decouples structural synthesis from parameter optimization by introducing an additional optimization layer for local parameter search. In our method, the numerical parameters of LLM-generated programs are extracted and optimized numerically to maximize task performance. With this integration, an LLM iterates over the functional structure of programs, while a separate optimization loop is used to find a locally optimal set of parameters accompanying candidate programs. We evaluate our method on a set of control tasks, showing that it achieves higher returns and improved sample efficiency compared to purely LLM-guided search. We show that combining symbolic program synthesis with numerical optimization yields interpretable yet high-performing policies, bridging the gap between language-model-guided design and classical control tuning. Our code is available at https://sites.google.com/berkeley.edu/colmo.
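
The decoupling described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the policy source string, the toy double-integrator task, the helper names (`extract_params`, `build_policy`, `rollout_return`, `local_search`), and the choice of a simple accept-if-better random search as the gradient-free optimizer. It only shows the general idea of pulling numeric literals out of a candidate program and tuning them while the program structure stays fixed.

```python
import random
import re

# Hypothetical LLM-generated policy source with hard-coded gains (assumption).
POLICY_SRC = "def policy(x, v):\n    return -1.0 * x - 0.5 * v\n"

# Matches float literals such as 1.0 or 0.5 that are not part of an identifier.
NUM = re.compile(r"(?<![\w.])\d+\.\d+")


def extract_params(src):
    """Replace float literals with p[i] placeholders; return (template, values)."""
    values = []

    def repl(match):
        values.append(float(match.group()))
        return f"p[{len(values) - 1}]"

    return NUM.sub(repl, src), values


def build_policy(template, p):
    """Compile the templated source against a concrete parameter vector p."""
    scope = {"p": list(p)}
    exec(template, scope)  # the compiled function reads p from this scope
    return scope["policy"]


def rollout_return(policy, steps=200, dt=0.05):
    """Return (negative quadratic cost) on a toy double integrator x'' = u."""
    x, v, total = 1.0, 0.0, 0.0
    for _ in range(steps):
        u = policy(x, v)
        total -= (x * x + 0.1 * u * u) * dt
        v += u * dt
        x += v * dt
    return total


def local_search(template, p0, iters=200, sigma=0.2, seed=0):
    """Gradient-free local search: perturb parameters, keep only improvements."""
    rng = random.Random(seed)
    best_p = list(p0)
    best_r = rollout_return(build_policy(template, best_p))
    for _ in range(iters):
        cand = [pi + rng.gauss(0.0, sigma) for pi in best_p]
        r = rollout_return(build_policy(template, cand))
        if r > best_r:
            best_p, best_r = cand, r
    return best_p, best_r


if __name__ == "__main__":
    template, p0 = extract_params(POLICY_SRC)
    initial = rollout_return(build_policy(template, p0))
    tuned_p, tuned = local_search(template, p0)
    print("initial return:", initial)
    print("tuned params:", tuned_p, "tuned return:", tuned)
```

In a full pipeline of the kind the abstract describes, an outer loop would ask the LLM for new program structures and call an inner search like `local_search` on each candidate, so that structures are compared at their locally tuned parameters rather than at whatever constants the LLM happened to write.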

Related papers

OptiML: An End-to-End Framework for Program Synthesis and CUDA Kernel Optimization [21.882017397032964]
We present OptiML, an end-to-end framework that maps either natural-language intent or input code to performance-optimized kernels. A search-based component (OptiML-X) then refines either synthesized or user-provided kernels using Monte Carlo Tree Search, guided by a hardware-driven reward derived from profiler feedback.
arXiv Detail & Related papers (2026-02-12T04:50:19Z)
Policy-Conditioned Policies for Multi-Agent Task Solving [53.67744322553693]
In this work, we propose a paradigm shift that bridges the gap by representing policies as human-interpretable source code. We reformulate the learning problem by utilizing Large Language Models (LLMs) as approximate interpreters. We formalize this process as Programmatic Iterated Best Response (PIBR), an algorithm where the policy code is optimized by textual gradients.
arXiv Detail & Related papers (2025-12-24T07:42:10Z)
LOOPRAG: Enhancing Loop Transformation Optimization with Retrieval-Augmented Large Language Models [23.6344001089164]
LOOPRAG is a retrieval-augmented generation framework designed to guide Large Language Models (LLMs) in performing effective loop optimization. We introduce a parameter-driven method to harness loop properties, which trigger various loop transformations, and generate diverse yet legal example codes. To enhance correct and efficient code generation, we introduce a feedback-based iterative mechanism that incorporates compilation, testing and performance results.
arXiv Detail & Related papers (2025-12-12T11:09:48Z)
SOCRATES: Simulation Optimization with Correlated Replicas and Adaptive Trajectory Evaluations [25.18297372152296]
SOCRATES is a novel two-stage procedure that automates the design of tailored SO algorithms. An ensemble of digital replicas of the real system is used as a testbed to evaluate a set of baseline SO algorithms. An LLM acts as a meta-optimizer, analyzing the performance trajectories of these algorithms to iteratively revise and compose a final, hybrid optimization schedule.
arXiv Detail & Related papers (2025-11-01T19:57:38Z)
Optimizing Prompt Sequences using Monte Carlo Tree Search for LLM-Based Optimization [20.44067161623662]
Large language models (LLMs) have demonstrated remarkable capabilities in code generation and structured reasoning. We propose a novel neural-symbolic framework that formulates prompt selection as a sequential decision process guided by Monte Carlo Tree Search. Our method explores and refines multi-step prompt sequences with the goal of improving code generation quality.
arXiv Detail & Related papers (2025-08-08T04:01:24Z)
Compiler Optimization via LLM Reasoning for Efficient Model Serving [7.257845254223727]
We introduce a compilation framework (dubbed REASONING COMPILER) that formulates optimization as a sequential, context-aware decision process. We achieve substantial speedups with markedly fewer samples than leading neural compilers.
arXiv Detail & Related papers (2025-06-02T07:02:46Z)
Optuna vs Code Llama: Are LLMs a New Paradigm for Hyperparameter Tuning? [45.58422897857411]
This work explores the use of large language models (LLMs) for hyperparameter optimization by fine-tuning a parameter-efficient version of Code Llama using LoRA. Our approach achieves competitive or superior Root Mean Square Error (RMSE) while substantially reducing computational overhead. Results demonstrate that LLM-based optimization not only rivals established Bayesian methods like Tree-structured Parzen Estimators (TPE) but also accelerates tuning for real-world applications requiring perceptual quality and low-latency processing.
arXiv Detail & Related papers (2025-04-08T13:15:47Z)
Training of Scaffolded Language Models with Language Supervision: A Survey [62.59629932720519]
This survey organizes the literature on the design and optimization of emerging structures around post-trained LMs. We refer to this overarching structure as scaffolded LMs and focus on LMs that are integrated into multi-step processes with tools.
arXiv Detail & Related papers (2024-10-21T18:06:25Z)
OptiBench Meets ReSocratic: Measure and Improve LLMs for Optimization Modeling [62.19438812624467]
Large language models (LLMs) have exhibited their problem-solving abilities in mathematical reasoning. We propose OptiBench, a benchmark for end-to-end optimization problem-solving with human-readable inputs and outputs.
arXiv Detail & Related papers (2024-07-13T13:27:57Z)
LLM as a Complementary Optimizer to Gradient Descent: A Case Study in Prompt Tuning [69.95292905263393]
We show that gradient-based optimization and high-level LLM-based optimization are complementary to each other and can effectively collaborate in a combined optimization framework.
arXiv Detail & Related papers (2024-05-30T06:24:14Z)
Unleashing the Potential of Large Language Models as Prompt Optimizers: Analogical Analysis with Gradient-based Model Optimizers [108.72225067368592]
We propose a novel perspective to investigate the design of large language model (LLM)-based prompt optimizers. We identify two pivotal factors in model parameter learning: update direction and update method. We develop a capable gradient-inspired LLM-based prompt optimizer, GPO.
arXiv Detail & Related papers (2024-02-27T15:05:32Z)
Efficient Non-Parametric Optimizer Search for Diverse Tasks [93.64739408827604]
We present the first efficient, scalable, and general framework that can directly search on the tasks of interest. Inspired by the innate tree structure of the underlying math expressions, we re-arrange the spaces into a super-tree. We adopt an adaptation of the Monte Carlo method to tree search, equipped with rejection sampling and equivalent-form detection.
arXiv Detail & Related papers (2022-09-27T17:51:31Z)
Adaptive pruning-based optimization of parameterized quantum circuits [62.997667081978825]
Variational hybrid quantum-classical algorithms are powerful tools to maximize the use of Noisy Intermediate Scale Quantum devices. We propose a strategy for such ansatze used in variational quantum algorithms, which we call "Parameter-Efficient Circuit Training" (PECT). Instead of optimizing all of the ansatz parameters at once, PECT launches a sequence of variational algorithms.
arXiv Detail & Related papers (2020-10-01T18:14:11Z)

This list is automatically generated from the titles and abstracts of the papers on this site.