Leveraging High-Fidelity Digital Models and Reinforcement Learning for Mission Engineering: A Case Study of Aerial Firefighting Under Perfect Information
- URL: http://arxiv.org/abs/2512.20589v2
- Date: Mon, 29 Dec 2025 20:24:58 GMT
- Title: Leveraging High-Fidelity Digital Models and Reinforcement Learning for Mission Engineering: A Case Study of Aerial Firefighting Under Perfect Information
- Authors: İbrahim Oğuz Çetinkaya, Sajad Khodadadian, Taylan G. Topcu
- Abstract summary: Mission environments are uncertain and dynamic, and mission outcomes are a direct function of how the mission assets will interact with this environment. This paper proposes an intelligent mission coordination methodology that integrates digital mission models with Reinforcement Learning (RL).
- Score: 1.0832844764942349
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As systems engineering (SE) objectives evolve from the design and operation of monolithic systems to complex System of Systems (SoS), the discipline of Mission Engineering (ME) has emerged and is increasingly being accepted as a new line of thinking for the SE community. Moreover, mission environments are uncertain and dynamic, and mission outcomes are a direct function of how the mission assets interact with this environment. This renders static architectures brittle and calls for analytically rigorous approaches to ME. To that end, this paper proposes an intelligent mission coordination methodology that integrates digital mission models with Reinforcement Learning (RL) and specifically addresses the need for adaptive task allocation and reconfiguration. More specifically, we leverage a Digital Engineering (DE) based infrastructure composed of a high-fidelity digital mission model and an agent-based simulation; we then formulate the mission tactics management problem as a Markov Decision Process (MDP) and employ an RL agent trained via Proximal Policy Optimization. By leveraging the simulation as a sandbox, we map the system states to actions, refining the policy based on realized mission outcomes. The utility of the RL-based intelligent mission coordinator is demonstrated through an aerial firefighting case study. Our findings indicate that the RL-based intelligent mission coordinator not only surpasses baseline performance but also significantly reduces the variability in mission performance. Thus, this study serves as a proof of concept demonstrating that DE-enabled mission simulations combined with advanced analytical tools offer a mission-agnostic framework for improving ME practice, which can be extended to more complicated fleet design and selection problems in the future from a mission-first perspective.
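The abstract names Proximal Policy Optimization as the training algorithm but gives no implementation details. As a rough illustration only, the standard PPO clipped surrogate objective can be sketched as below. The function name `ppo_clipped_objective`, its parameter names, and the clipping value of 0.2 are assumptions made for this sketch and are not taken from the paper.

```python
import math


def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch of samples.

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. advantages: advantage estimates.
    clip_eps is an assumed value, not one reported by the paper.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# Hypothetical batch: when both policies agree, every ratio is 1 and the
# objective reduces to the mean advantage.
value = ppo_clipped_objective([0.0, 0.0], [0.0, 0.0], [1.0, -1.0])
```

In a mission-coordination setting such as the one described, the advantages would come from simulated mission outcomes, but how the authors compute them is not stated here.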
Related papers
- EmbodiedBrain: Expanding Performance Boundaries of Task Planning for Embodied Intelligence [17.644658293987955]
Embodied AI agents are capable of robust spatial perception, effective task planning, and adaptive execution in physical environments. Current large language models (LLMs) and multimodal LLMs (MLLMs) for embodied tasks suffer from key limitations. We propose EmbodiedBrain, a novel vision-language foundation model available in both 7B and 32B parameter sizes.
arXiv Detail & Related papers (2025-10-23T14:05:55Z) - The Landscape of Agentic Reinforcement Learning for LLMs: A Survey [103.32591749156416]
The emergence of agentic reinforcement learning (Agentic RL) marks a paradigm shift from conventional reinforcement learning applied to large language models (LLM RL). This survey formalizes this conceptual shift by contrasting the degenerate single-step Markov Decision Processes (MDPs) of LLM-RL with the temporally extended, partially observable Markov decision processes (POMDPs) that define Agentic RL.
arXiv Detail & Related papers (2025-09-02T17:46:26Z) - Integrating Symbolic RL Planning into a BDI-based Autonomous UAV Framework: System Integration and SIL Validation [3.5966087153300057]
We propose an extended version of the Autonomous Mission Agents for Drones (AMAD) cognitive multi-agent architecture, enhanced with symbolic reinforcement learning for dynamic mission planning and execution. We validated our framework in a Software-in-the-Loop (SIL) environment structured identically to an intended Hardware-In-the-Loop Simulation (HILS) platform. Experimental results demonstrate stable integration and interoperability of modules, successful transitions between BDI-driven and symbolic RL-driven planning phases, and consistent mission performance.
arXiv Detail & Related papers (2025-08-16T03:27:26Z) - UAS Visual Navigation in Large and Unseen Environments via a Meta Agent [0.13654846342364302]
We propose a meta-curriculum training scheme to efficiently learn to navigate in large-scale urban environments. We organize the training curriculum in a hierarchical manner such that the agent is guided from coarse to fine towards the target task. In contrast to traditional reinforcement learning (RL), which focuses on acquiring a policy for a specific task, MRL aims to learn a policy with fast transfer ability to novel tasks.
arXiv Detail & Related papers (2025-03-20T01:44:59Z) - Scaling Autonomous Agents via Automatic Reward Modeling And Planning [52.39395405893965]
Large language models (LLMs) have demonstrated remarkable capabilities across a range of tasks. However, they still struggle with problems requiring multi-step decision-making and environmental feedback. We propose a framework that can automatically learn a reward model from the environment without human annotations.
arXiv Detail & Related papers (2025-02-17T18:49:25Z) - Probabilistic Mission Design in Neuro-Symbolic Systems [19.501311018760177]
Probabilistic Mission Design (ProMis) is a system architecture that links geospatial and sensory data with declarative, Hybrid Probabilistic Logic Programs (HPLP). ProMis generates Probabilistic Mission Landscapes (PML), which quantify the agent's belief that a set of mission conditions is satisfied across its navigation space. We show its integration with potent machine learning models such as Large Language Models (LLM) and Transformer-based vision models.
arXiv Detail & Related papers (2024-12-25T11:04:00Z) - Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate [118.37653302885607]
We present the Modality Integration Rate (MIR), an effective, robust, and generalized metric to indicate the multi-modal pre-training quality of Large Vision Language Models (LVLMs).
MIR is indicative of how training data selection, training strategy schedule, and model architecture design affect pre-training results.
arXiv Detail & Related papers (2024-10-09T17:59:04Z) - Aquatic Navigation: A Challenging Benchmark for Deep Reinforcement Learning [53.3760591018817]
We propose a new benchmarking environment for aquatic navigation using recent advances in the integration between game engines and Deep Reinforcement Learning.
Specifically, we focus on PPO, one of the most widely accepted algorithms, and we propose advanced training techniques.
Our empirical evaluation shows that a well-designed combination of these ingredients can achieve promising results.
arXiv Detail & Related papers (2024-05-30T23:20:23Z) - Multitask Adaptation by Retrospective Exploration with Learned World Models [77.34726150561087]
We propose a meta-learned addressing model called RAMa that provides training samples for the MBRL agent taken from task-agnostic storage.
The model is trained to maximize the agent's expected performance by selecting, from the storage, promising trajectories that solved prior tasks.
arXiv Detail & Related papers (2021-10-25T20:02:57Z) - Hierarchical Reinforcement Learning Framework for Stochastic Spaceflight Campaign Design [5.381116150823982]
This paper develops a hierarchical reinforcement learning architecture for spaceflight campaign design under uncertainty.
It is applied to a set of human lunar exploration campaign scenarios with uncertain in-situ resource utilization (ISRU) performance.
arXiv Detail & Related papers (2021-03-16T11:17:02Z) - Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies [57.27944046925876]
We propose and address a novel few-shot RL problem, where a task is characterized by a subtask graph.
Instead of directly learning a meta-policy, we develop a Meta-learner with Subtask Graph Inference.
Our experimental results on two grid-world domains and StarCraft II environments show that the proposed method is able to accurately infer the latent task parameter.
arXiv Detail & Related papers (2020-01-01T17:34:00Z)