MOSAIC: Multi-agent Orchestration for Task-Intelligent Scientific Coding
- URL: http://arxiv.org/abs/2510.08804v1
- Date: Thu, 09 Oct 2025 20:35:23 GMT
- Title: MOSAIC: Multi-agent Orchestration for Task-Intelligent Scientific Coding
- Authors: Siddeshwar Raghavan, Tanwi Mallick
- Abstract summary: MOSAIC is a training-free framework with specially designed agents that self-reflect, create the rationale, code, and debug within a student-teacher paradigm. We evaluate MOSAIC on scientific coding benchmarks and demonstrate that our specialized agentic framework outperforms existing approaches in terms of accuracy, robustness, and interpretability.
- Score: 5.470408942595905
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present MOSAIC, a multi-agent Large Language Model (LLM) framework for solving challenging scientific coding tasks. Unlike general-purpose coding, scientific workflows require algorithms that are rigorous, interwoven with deep domain knowledge, and capable of domain-specific reasoning, and they must support algorithm iteration without requiring I/O test cases. Many scientific problems also require solving a sequence of subproblems that leads to the final desired result. MOSAIC is designed as a training-free framework with specially designed agents that self-reflect, create the rationale, code, and debug within a student-teacher paradigm to address the challenges of scientific code generation. This design facilitates stepwise problem decomposition and targeted error correction and, when combined with our Consolidated Context Window (CCW), mitigates LLM hallucinations when solving complex scientific tasks involving chained subproblems. We evaluate MOSAIC on scientific coding benchmarks and demonstrate that our specialized agentic framework outperforms existing approaches in terms of accuracy, robustness, and interpretability.
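The student-teacher loop and Consolidated Context Window described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: all class and method names (`ConsolidatedContextWindow`, `solve_task`, `reflect`, `review`, the stub agents) are invented for this example.

```python
from dataclasses import dataclass, field

@dataclass
class ConsolidatedContextWindow:
    """Accumulates only validated subproblem results, so later steps see a
    clean summary of prior work (hallucination mitigation for chained tasks)."""
    entries: list = field(default_factory=list)

    def add(self, subproblem, solution):
        self.entries.append((subproblem, solution))

    def render(self):
        return "\n".join(f"[{p}] -> {s}" for p, s in self.entries)

class Feedback:
    def __init__(self, ok, message=""):
        self.ok, self.message = ok, message

class StubStudent:
    """Stand-in for an LLM-backed student agent (invented for this sketch)."""
    def reflect(self, subproblem, context):
        return f"rationale for {subproblem}"
    def write_code(self, subproblem, rationale):
        return f"solution({subproblem})"
    def debug(self, code, feedback):
        return code + "  # revised"

class StubTeacher:
    """Stand-in for an LLM-backed teacher/critic agent."""
    def review(self, subproblem, code):
        return Feedback(ok=True)

def solve_task(subproblems, student, teacher, max_debug_rounds=3):
    """Stepwise decomposition: solve subproblems in order, letting the
    teacher critique the student's code until it passes review."""
    ccw = ConsolidatedContextWindow()
    for sp in subproblems:
        rationale = student.reflect(sp, ccw.render())    # self-reflection
        code = student.write_code(sp, rationale)
        for _ in range(max_debug_rounds):
            feedback = teacher.review(sp, code)          # targeted correction
            if feedback.ok:
                break
            code = student.debug(code, feedback)
        ccw.add(sp, code)                                # consolidate context
    return ccw
```

Only the consolidated entries flow into the next subproblem's prompt, which is the mechanism the abstract credits with mitigating hallucination across chained subproblems.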
Related papers
- AI-for-Science Low-code Platform with Bayesian Adversarial Multi-Agent Framework [4.782965804438204]
Large Language Models (LLMs) demonstrate potential for automating scientific code generation but face challenges in reliability, error propagation, and evaluation. We present a Bayesian adversarial multi-agent framework specifically designed for AI for Science (AI4S) tasks in the form of a Low-code Platform (LCP). Three LLM-based agents are coordinated under the Bayesian framework: a Task Manager that structures user inputs into actionable plans and adaptive test cases, a Code Generator that produces candidate solutions, and an Evaluator providing comprehensive feedback.
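As a rough sketch, the three-agent pipeline this summary describes (Task Manager, Code Generator, Evaluator) could look like the following. Everything here is hypothetical: the function names and the toy "square a number" task are invented, and the Bayesian coordination is reduced to a plain accept/retry loop.

```python
def task_manager(user_input: str):
    """Structure the user's request into a plan plus adaptive test cases.
    Here the tests are hard-coded (input, expected) pairs for a toy task."""
    plan = f"plan: {user_input}"
    tests = [(2, 4), (3, 9)]
    return plan, tests

def code_generator(plan: str) -> str:
    """Produce a candidate solution (an LLM call in the real system)."""
    return "def solve(x):\n    return x * x"

def evaluator(candidate: str, tests) -> bool:
    """Run the adaptive test cases against the candidate code."""
    namespace = {}
    exec(candidate, namespace)  # illustrative only; sandbox in practice
    return all(namespace["solve"](x) == y for x, y in tests)

def run_pipeline(user_input: str, max_rounds: int = 3):
    """Generate-and-evaluate loop: retry until the Evaluator accepts."""
    plan, tests = task_manager(user_input)
    for _ in range(max_rounds):
        candidate = code_generator(plan)
        if evaluator(candidate, tests):
            return candidate
    return None
```

In the paper's setting the Evaluator's feedback would condition the next generation round; the sketch simply regenerates and re-tests.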
arXiv Detail & Related papers (2026-03-03T18:25:00Z)
- ComAgent: Multi-LLM based Agentic AI Empowered Intelligent Wireless Networks [62.031889234230725]
6G networks rely on complex cross-layer optimization. Manually translating high-level intents into mathematical formulations remains a bottleneck. We present ComAgent, a multi-LLM agentic AI framework.
arXiv Detail & Related papers (2026-01-27T13:43:59Z)
- ATHENA: Agentic Team for Hierarchical Evolutionary Numerical Algorithms [4.235429894371577]
ATHENA is an agentic framework designed as an Autonomous Lab to manage the end-to-end computational research lifecycle. Its core is the HENA loop, a knowledge-driven diagnostic process framed as a Contextual problem. The framework achieves super-human performance, reaching validation errors of $10^{-14}$.
arXiv Detail & Related papers (2025-12-03T06:05:27Z)
- SciAgent: A Unified Multi-Agent System for Generalistic Scientific Reasoning [54.186990494217916]
SciAgent is a unified multi-agent system designed for generalistic scientific reasoning. A Coordinator Agent interprets each problem's domain and complexity, dynamically orchestrating specialized Worker Systems. These Worker Systems are composed of interacting reasoning Sub-agents for symbolic deduction, conceptual modeling, numerical computation, and verification.
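A coordinator that routes problems to specialized worker systems, as described here, can be sketched minimally. The names and the keyword-free digit heuristic are invented for illustration; the actual system interprets domain and complexity with an LLM.

```python
from typing import Callable, Dict

def symbolic_worker(problem: str) -> str:
    # Stand-in for symbolic-deduction / conceptual-modeling sub-agents.
    return f"symbolic:{problem}"

def numeric_worker(problem: str) -> str:
    # Stand-in for numerical-computation / verification sub-agents.
    return f"numeric:{problem}"

WORKERS: Dict[str, Callable[[str], str]] = {
    "symbolic": symbolic_worker,
    "numeric": numeric_worker,
}

def coordinator(problem: str) -> str:
    """Classify the problem's domain (a toy digit heuristic here; a real
    coordinator would use an LLM) and dispatch it to a worker system."""
    domain = "numeric" if any(ch.isdigit() for ch in problem) else "symbolic"
    return WORKERS[domain](problem)
```

The dispatch-table pattern keeps the coordinator decoupled from the workers, so new worker systems can be registered without changing routing logic.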
arXiv Detail & Related papers (2025-11-11T12:00:34Z)
- Re4: Scientific Computing Agent with Rewriting, Resolution, Review and Revision [4.55391222496256]
Large language models (LLMs) are an active and promising field of generative artificial intelligence. In this work, we construct a novel agent framework for solving representative problems in scientific computing. The proposed agent, incorporating a "rewriting-resolution-review-revision" logical chain, operates in a collaborative and interactive manner.
arXiv Detail & Related papers (2025-08-28T12:50:48Z)
- Computational Thinking Reasoning in Large Language Models [69.28428524878885]
The Computational Thinking Model (CTM) is a novel framework that incorporates computational thinking paradigms into large language models (LLMs). Live code execution is seamlessly integrated into the reasoning process, allowing CTM to think by computing. CTM outperforms conventional reasoning models and tool-augmented baselines in terms of accuracy, interpretability, and generalizability.
arXiv Detail & Related papers (2025-06-03T09:11:15Z)
- ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows [82.07367406991678]
Large Language Models (LLMs) have extended their impact beyond Natural Language Processing. Among these, computer-using agents are capable of interacting with operating systems as humans do. We introduce ScienceBoard, which encompasses a realistic, multi-domain environment featuring dynamic and visually rich scientific software.
arXiv Detail & Related papers (2025-05-26T12:27:27Z)
- ModelingAgent: Bridging LLMs and Mathematical Modeling for Real-World Challenges [72.19809898215857]
We introduce ModelingBench, a novel benchmark featuring real-world-inspired, open-ended problems from math modeling competitions across diverse domains. These tasks require translating natural language into formal mathematical formulations, applying appropriate tools, and producing structured, defensible reports. We also present ModelingAgent, a multi-agent framework that coordinates tool use, supports structured problem solving, and generates well-grounded, creative solutions.
arXiv Detail & Related papers (2025-05-21T03:33:23Z)
- Limits of Deep Learning: Sequence Modeling through the Lens of Complexity Theory [15.24542569393982]
Despite their successes, deep learning models struggle with tasks requiring complex reasoning and function composition. We present a theoretical and empirical investigation into the limitations of Structured State Space Models (SSMs) and Transformers in such tasks. We highlight the need for innovative solutions to achieve reliable multi-step reasoning and compositional task-solving.
arXiv Detail & Related papers (2024-05-26T19:33:23Z)
- Faith and Fate: Limits of Transformers on Compositionality [109.79516190693415]
We investigate the limits of transformer large language models across three representative compositional tasks.
These tasks require breaking problems down into sub-steps and synthesizing these steps into a precise answer.
Our empirical findings suggest that transformer LLMs solve compositional tasks by reducing multi-step compositional reasoning into linearized subgraph matching.
arXiv Detail & Related papers (2023-05-29T23:24:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.