Related papers: Agent Mars: Multi-Agent Simulation for Multi-Planetary Life Exploration and Settlement

URL: http://arxiv.org/abs/2602.13291v1
Date: Mon, 09 Feb 2026 00:29:06 GMT
Title: Agent Mars: Multi-Agent Simulation for Multi-Planetary Life Exploration and Settlement
Authors: Ziyang Wang,
Abstract summary: Space exploration and settlement offer vast environments and resources, but impose constraints unmatched on Earth. The key challenge is auditable coordination among specialised humans, robots, and digital services in a safety-critical system-of-systems. We introduce Agent Mars, an open, end-to-end multi-agent simulation framework for Mars base operations.
Score: 17.969021804498844
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Artificial Intelligence (AI) has transformed robotics, healthcare, industry, and scientific discovery, yet a major frontier may lie beyond Earth. Space exploration and settlement offer vast environments and resources, but impose constraints unmatched on Earth: delayed/intermittent communications, extreme resource scarcity, heterogeneous expertise, and strict safety, accountability, and command authority. The key challenge is auditable coordination among specialised humans, robots, and digital services in a safety-critical system-of-systems. We introduce Agent Mars, an open, end-to-end multi-agent simulation framework for Mars base operations. Agent Mars formalises a realistic organisation with a 93-agent roster across seven layers of command and execution (human roles and physical assets), enabling base-scale studies beyond toy settings. It implements hierarchical and cross-layer coordination that preserves chain-of-command while allowing vetted cross-layer exchanges with audit trails; supports dynamic role handover with automatic failover under outages; and enables phase-dependent leadership for routine operations, emergencies, and science campaigns. Agent Mars further models mission-critical mechanisms (scenario-aware short/long-horizon memory, configurable propose-vote consensus, and translator-mediated heterogeneous protocols) to capture how teams align under stress. To quantify behaviour, we propose the Agent Mars Performance Index (AMPI), an interpretable composite score with diagnostic sub-metrics. Across 13 reproducible Mars-relevant operational scripts, Agent Mars reveals coordination trade-offs and identifies regimes where curated cross-layer collaboration and functional leadership reduce overhead without sacrificing reliability. Agent Mars provides a benchmarkable, auditable foundation for Space AI.

Related papers

AstroReason-Bench: Evaluating Unified Agentic Planning across Heterogeneous Space Planning Problems [71.89040853616602]
We introduce AstroReason-Bench, a benchmark for evaluating agentic planning in Space Planning Problems (SPP). AstroReason-Bench integrates multiple scheduling regimes, including ground station communication and agile Earth observation, and provides a unified agent-oriented interaction protocol. We find that current agents substantially underperform specialized solvers, highlighting key limitations of generalist planning under realistic constraints.
arXiv Detail & Related papers (2026-01-16T15:02:41Z)
What Do LLM Agents Know About Their World? Task2Quiz: A Paradigm for Studying Environment Understanding [50.35012849818872]
Large language model (LLM) agents have demonstrated remarkable capabilities in complex decision-making and tool-use tasks. We propose Task-to-Quiz (T2Q), a deterministic and automated evaluation paradigm designed to decouple task execution from world-state understanding. Our experiments reveal that task success is often a poor proxy for environment understanding, and that current memory mechanisms cannot effectively help agents acquire a grounded model of the environment.
arXiv Detail & Related papers (2026-01-14T14:09:11Z)
Heterogeneous Robot Collaboration in Unstructured Environments with Grounded Generative Intelligence [54.91177026001217]
Large language model (LLM)-enabled teaming methods typically assume well-structured and known environments. We present SPINE-HT, a framework that addresses these limitations by grounding the reasoning abilities of LLMs in the context of a heterogeneous robot team. Our framework achieves nearly twice the success rate compared to prior LLM-enabled heterogeneous teaming approaches.
arXiv Detail & Related papers (2025-10-30T18:24:38Z)
Towards Self-Evolving Benchmarks: Synthesizing Agent Trajectories via Test-Time Exploration under Validate-by-Reproduce Paradigm [60.36837655498119]
We propose a Trajectory-based validated-by-Reproducing Agent-benchmark Complexity Evolution (TRACE) framework. This framework takes an original task from an existing benchmark and encourages agents to evolve it into a new task with higher difficulty. Experiments on the GAIA benchmark demonstrate that the TRACE framework consistently enhances task complexity while improving the reliability of correctness.
arXiv Detail & Related papers (2025-10-01T01:52:52Z)
UltraHorizon: Benchmarking Agent Capabilities in Ultra Long-Horizon Scenarios [63.67884284105684]
We introduce UltraHorizon, a novel benchmark that measures the foundational capabilities essential for complex real-world challenges. Agents are designed in long-horizon discovery tasks where they must iteratively uncover hidden rules. Our experiments reveal that LLM-agents consistently underperform in these settings, whereas human participants achieve higher scores.
arXiv Detail & Related papers (2025-09-26T02:04:00Z)
From MAS to MARS: Coordination Failures and Reasoning Trade-offs in Hierarchical Multi-Agent Robotic Systems within a Healthcare Scenario [3.5262044630932254]
Multi-agent robotic systems (MARS) build upon multi-agent systems by integrating physical and task-related constraints. Despite the availability of advanced multi-agent frameworks, their real-world deployment on robots remains limited.
arXiv Detail & Related papers (2025-08-06T17:54:10Z)
Multi-Agent Reinforcement Learning for Autonomous Multi-Satellite Earth Observation: A Realistic Case Study [10.393102715510937]
The exponential growth of Low Earth Orbit (LEO) satellites has revolutionised Earth Observation (EO) missions. Traditional optimisation approaches struggle to handle the real-time decision-making demands of dynamic EO missions. We investigate RL-based autonomous EO mission planning by modelling single-satellite operations and extending to multi-satellite constellations.
arXiv Detail & Related papers (2025-06-18T07:42:11Z)
From Virtual Agents to Robot Teams: A Multi-Robot Framework Evaluation in High-Stakes Healthcare Context [2.016235597066821]
Current frameworks treat agents as conceptual task executors rather than physically embodied entities. We propose three design guidelines emphasizing process transparency, proactive failure recovery, and contextual grounding. Our work informs the development of more resilient and robust multi-agent robotic systems.
arXiv Detail & Related papers (2025-06-04T04:05:38Z)
AgentGym: Evolving Large Language Model-based Agents across Diverse Environments [116.97648507802926]
Large language models (LLMs) are considered a promising foundation to build such agents. We take the first step towards building generally-capable LLM-based agents with self-evolution ability. We propose AgentGym, a new framework featuring a variety of environments and tasks for broad, real-time, uni-format, and concurrent agent exploration.
arXiv Detail & Related papers (2024-06-06T15:15:41Z)
We Choose to Go to Space: Agent-driven Human and Multi-Robot Collaboration in Microgravity [28.64243893838686]
Future space exploration requires humans and robots to work together. We present SpaceAgents-1, a system for learning human and multi-robot collaboration strategies under microgravity conditions.
arXiv Detail & Related papers (2024-02-22T05:32:27Z)
Enabling Astronaut Self-Scheduling using a Robust Advanced Modelling and Scheduling system: an assessment during a Mars analogue mission [44.621922701019336]
We study the usage of a computer decision-support tool by a crew of analog astronauts. The proposed tool, called Romie, belongs to the new category of Robust Advanced Modelling and Scheduling (RAMS) systems.
arXiv Detail & Related papers (2023-01-14T21:10:05Z)

This list is automatically generated from the titles and abstracts of the papers on this site.