NEMO: Execution-Aware Optimization Modeling via Autonomous Coding Agents
- URL: http://arxiv.org/abs/2601.21372v1
- Date: Thu, 29 Jan 2026 07:57:23 GMT
- Title: NEMO: Execution-Aware Optimization Modeling via Autonomous Coding Agents
- Authors: Yang Song, Anoushka Vyas, Zirui Wei, Sina Khoshfetrat Pakazad, Henrik Ohlsson, Graham Neubig
- Abstract summary: We present NEMO, a system that translates Natural-language descriptions of decision problems into formal Executable Mathematical Optimization implementations. NEMO centers on remote interaction with autonomous coding agents (ACAs), treated as a first-class abstraction analogous to API-based interaction with LLMs. Because ACAs execute within sandboxed environments, code produced by NEMO is executable by construction, allowing automated validation and repair.
- Score: 41.70615840873279
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this paper, we present NEMO, a system that translates Natural-language descriptions of decision problems into formal Executable Mathematical Optimization implementations, operating collaboratively with users or autonomously. Existing approaches typically rely on specialized large language models (LLMs) or bespoke, task-specific agents. Such methods are often brittle and complex, and they frequently generate syntactically invalid or non-executable code. NEMO instead centers on remote interaction with autonomous coding agents (ACAs), treated as a first-class abstraction analogous to API-based interaction with LLMs. This design enables the construction of higher-level systems around ACAs that structure, consolidate, and iteratively refine task specifications. Because ACAs execute within sandboxed environments, code produced by NEMO is executable by construction, allowing automated validation and repair. Building on this, we introduce novel coordination patterns with and across ACAs, including asymmetric validation loops between independently generated optimizer and simulator implementations (serving as a high-level validation mechanism), external memory for experience reuse, and robustness enhancements via minimum Bayes risk (MBR) decoding and self-consistency. We evaluate NEMO on nine established optimization benchmarks. As depicted in Figure 1, it achieves state-of-the-art performance on the majority of tasks, with substantial margins on several datasets, demonstrating the power of execution-aware agentic architectures for automated optimization modeling.
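The abstract names minimum Bayes risk (MBR) decoding and self-consistency as robustness enhancements but gives no implementation detail on this page. The following is a minimal sketch of the general technique only, not NEMO's method. It assumes that each independently generated run reports a single scalar objective value and that agreement between two runs is judged by numerical closeness. The function names `agreement` and `mbr_select` and the tolerance values are hypothetical.

```python
import math
from typing import Sequence


def agreement(a: float, b: float) -> float:
    """Hypothetical utility: 1.0 if two objective values match within tolerance, else 0.0."""
    return 1.0 if math.isclose(a, b, rel_tol=1e-6, abs_tol=1e-9) else 0.0


def mbr_select(candidates: Sequence[float]) -> int:
    """Return the index of the candidate with the highest total agreement
    with all other candidates (ties go to the earliest candidate).

    With a 0/1 agreement utility this behaves much like a majority vote,
    which is the self-consistency special case of MBR selection.
    """
    if not candidates:
        raise ValueError("at least one candidate is required")
    best_idx, best_score = 0, float("-inf")
    for i, value in enumerate(candidates):
        # Sum the utility of candidate i against every other candidate.
        score = sum(
            agreement(value, other)
            for j, other in enumerate(candidates)
            if j != i
        )
        if score > best_score:
            best_idx, best_score = i, score
    return best_idx


if __name__ == "__main__":
    # Illustrative objective values from five hypothetical runs.
    runs = [120.0, 118.5, 120.0, 120.0000001, 95.0]
    print(runs[mbr_select(runs)])
```

A graded utility (for example, one that decays with the relative gap between objective values) could replace the 0/1 rule if near misses should still count as partial agreement.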
Related papers
- The Auton Agentic AI Framework [5.410458076724158]
The field of Artificial Intelligence is undergoing a transition from Generative AI to Agentic AI. This transition exposes a fundamental architectural mismatch: Large Language Models (LLMs) produce unstructured outputs, whereas the backend infrastructure they must control requires deterministic, schema-conformant inputs. The present paper describes the Auton Agentic AI Framework, a principled architecture for the creation and governance of autonomous agents.
arXiv Detail & Related papers (2026-02-27T06:42:08Z)
- MIRROR: A Multi-Agent Framework with Iterative Adaptive Revision and Hierarchical Retrieval for Optimization Modeling in Operations Research [15.28095645151852]
MIRROR is a fine-tuning-free, end-to-end multi-agent framework for operations research. It translates natural language optimization problems into mathematical models and solver code. Experiments show that MIRROR outperforms existing methods on standard Operations Research benchmarks.
arXiv Detail & Related papers (2026-02-03T09:46:56Z)
- ComAgent: Multi-LLM based Agentic AI Empowered Intelligent Wireless Networks [62.031889234230725]
6G networks rely on complex cross-layer optimization. Manually translating high-level intents into mathematical formulations remains a bottleneck. We present ComAgent, a multi-LLM agentic AI framework.
arXiv Detail & Related papers (2026-01-27T13:43:59Z)
- Monadic Context Engineering [59.95390010097654]
This paper introduces Monadic Context Engineering (MCE) to provide a formal foundation for agent design. We demonstrate how Monads enable robust composition, how Applicatives provide a principled structure for parallel execution, and crucially, how Monad Transformers allow for the systematic composition of these capabilities. This layered approach enables developers to construct complex, resilient, and efficient AI agents from simple, independently verifiable components.
arXiv Detail & Related papers (2025-12-27T01:52:06Z)
- Hybrid Agentic AI and Multi-Agent Systems in Smart Manufacturing [0.0]
This paper presents a hybrid agentic AI and multi-agent framework for a Prescriptive Maintenance use case. The proposed framework adopts a layered architecture that consists of perception, preprocessing, analytics, and optimization layers. Specialized agents autonomously handle schema discovery, intelligent feature analysis, model selection, and prescriptive optimization. An initial proof-of-concept implementation is validated on two industrial manufacturing datasets.
arXiv Detail & Related papers (2025-11-23T03:06:23Z)
- EmbodiedBrain: Expanding Performance Boundaries of Task Planning for Embodied Intelligence [17.644658293987955]
Embodied AI agents are capable of robust spatial perception, effective task planning, and adaptive execution in physical environments. Current large language models (LLMs) and multimodal LLMs (MLLMs) for embodied tasks suffer from key limitations. We propose EmbodiedBrain, a novel vision-language foundation model available in both 7B and 32B parameter sizes.
arXiv Detail & Related papers (2025-10-23T14:05:55Z)
- Sample-Efficient Online Learning in LM Agents via Hindsight Trajectory Rewriting [92.57796055887995]
We introduce ECHO, a prompting framework that adapts hindsight experience replay from reinforcement learning for language model agents. ECHO generates optimized trajectories for alternative goals that could have been achieved during failed attempts. We evaluate ECHO on stateful versions of XMiniGrid, a text-based navigation and planning benchmark, and PeopleJoinQA, a collaborative information-gathering enterprise simulation.
arXiv Detail & Related papers (2025-10-11T18:11:09Z)
- Blueprint First, Model Second: A Framework for Deterministic LLM Workflow [3.9886771197662925]
We introduce the Source Code Agent framework, a new paradigm built on the "Blueprint First, Model Second" philosophy. Our framework decouples the workflow logic from the generative model. Our work enables the verifiable and reliable deployment of autonomous agents in applications governed by strict procedural logic.
arXiv Detail & Related papers (2025-08-01T03:10:00Z)
- MAS-ZERO: Designing Multi-Agent Systems with Zero Supervision [76.42361936804313]
We introduce MAS-ZERO, the first self-evolved, inference-time framework for automatic MAS design. MAS-ZERO employs meta-level design to iteratively generate, evaluate, and refine MAS configurations tailored to each problem instance.
arXiv Detail & Related papers (2025-05-21T00:56:09Z)
- BLADE: Benchmark suite for LLM-driven Automated Design and Evolution of iterative optimisation heuristics [2.2485774453793037]
BLADE is a framework for benchmarking LLM-driven AAD methods in a continuous black-box optimisation context. It integrates benchmark problems with instance generators and textual descriptions aimed at capability-focused testing, such as specialisation and information exploitation. BLADE provides an 'out-of-the-box' solution to systematically evaluate LLM-driven AAD approaches.
arXiv Detail & Related papers (2025-04-28T18:34:09Z)
- Collab: Controlled Decoding using Mixture of Agents for LLM Alignment [90.6117569025754]
Reinforcement learning from human feedback has emerged as an effective technique to align large language models. Controlled Decoding provides a mechanism for aligning a model at inference time without retraining. We propose a mixture of agent-based decoding strategies leveraging the existing off-the-shelf aligned LLM policies.
arXiv Detail & Related papers (2025-03-27T17:34:25Z)
- Towards more Contextual Agents: An extractor-Generator Optimization Framework [0.0]
Large Language Model (LLM)-based agents have demonstrated remarkable success in solving complex tasks across a wide range of general-purpose applications. However, their performance often degrades in context-specific scenarios, such as specialized industries or research domains. To address this challenge, our work introduces a systematic approach to enhance the contextual adaptability of LLM-based agents.
arXiv Detail & Related papers (2025-02-18T15:07:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.