Hierarchical Reinforcement Learning Framework for Stochastic Spaceflight
Campaign Design
- URL: http://arxiv.org/abs/2103.08981v1
- Date: Tue, 16 Mar 2021 11:17:02 GMT
- Title: Hierarchical Reinforcement Learning Framework for Stochastic Spaceflight
Campaign Design
- Authors: Yuji Takubo, Hao Chen, and Koki Ho
- Abstract summary: This paper develops a hierarchical reinforcement learning architecture for spaceflight campaign design under uncertainty.
It is applied to a set of human lunar exploration campaign scenarios with uncertain in-situ resource utilization (ISRU) performance.
- Score: 5.381116150823982
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper develops a hierarchical reinforcement learning architecture for
multi-mission spaceflight campaign design under uncertainty, including vehicle
design, infrastructure deployment planning, and space transportation
scheduling. This problem involves a high-dimensional design space and is
challenging especially with uncertainty present. To tackle this challenge, the
developed framework has a hierarchical structure with reinforcement learning
(RL) and network-based mixed-integer linear programming (MILP), where the
former optimizes campaign-level decisions (e.g., design of the vehicle used
throughout the campaign, destination demand assigned to each mission in the
campaign), whereas the latter optimizes the detailed mission-level decisions
(e.g., when to launch what from where to where). The framework is applied to a
set of human lunar exploration campaign scenarios with uncertain in-situ
resource utilization (ISRU) performance as a case study. The main value of this
work is its integration of the rapidly growing RL research and the existing
MILP-based space logistics methods through a hierarchical framework to handle
the otherwise intractable complexity of space mission design under uncertainty.
We expect this unique framework to be a critical steppingstone for the emerging
research direction of artificial intelligence for space mission design.
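The two-level split described in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: an epsilon-greedy bandit stands in for the RL agent, brute-force enumeration stands in for the network-based MILP, and every name and number below (`VEHICLE_SIZES`, `DEMAND`, the cost model, the ISRU yield values) is a hypothetical assumption chosen only to make the structure concrete.

```python
import itertools
import random

# Hypothetical toy numbers; none of these come from the paper.
VEHICLE_SIZES = [2, 4, 6]   # campaign-level choice: payload capacity per launch
DEMAND = 10                 # total payload the campaign must deliver
MISSIONS = 2                # number of missions in the campaign
MAX_LAUNCHES = 3            # launches allowed per mission


def mission_level_cost(capacity, isru_yield):
    """Lower level: stand-in for the MILP, solved by brute-force enumeration.

    Chooses integer launch counts per mission that cover the remaining
    demand at minimum cost. Returns None if no plan is feasible.
    """
    need = max(0, DEMAND - isru_yield)   # ISRU output offsets launched payload
    launch_cost = 1.0 + 0.5 * capacity   # assumed: larger vehicles cost more
    best = None
    for plan in itertools.product(range(MAX_LAUNCHES + 1), repeat=MISSIONS):
        if capacity * sum(plan) >= need:
            cost = launch_cost * sum(plan)
            if best is None or cost < best:
                best = cost
    return best


def train(episodes=2000, epsilon=0.1, seed=0):
    """Upper level: epsilon-greedy bandit over vehicle sizes (a very simple agent)."""
    rng = random.Random(seed)
    value = {c: 0.0 for c in VEHICLE_SIZES}
    count = {c: 0 for c in VEHICLE_SIZES}
    for _ in range(episodes):
        if rng.random() < epsilon:
            capacity = rng.choice(VEHICLE_SIZES)                    # explore
        else:
            capacity = max(VEHICLE_SIZES, key=lambda k: value[k])   # exploit
        isru_yield = rng.choice([0, 2, 4])   # uncertain ISRU performance
        cost = mission_level_cost(capacity, isru_yield)
        reward = -cost if cost is not None else -100.0
        count[capacity] += 1
        # Incremental average of the observed rewards for this vehicle size.
        value[capacity] += (reward - value[capacity]) / count[capacity]
    return value


if __name__ == "__main__":
    for capacity, v in train().items():
        print(f"capacity={capacity}: estimated reward {v:.2f}")
```

The design point is that the upper level only ever sees a scalar reward returned by the lower-level optimizer, so the detailed scheduling variables never enter the learning problem.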
Related papers
- Structural Induced Exploration for Balanced and Scalable Multi-Robot Path Planning [6.823580643749891]
Multi-robot path planning is a fundamental yet challenging problem due to its complexity and the need to balance global efficiency with fair task allocation among robots.
Traditional swarm intelligence methods, although effective on small instances, often converge prematurely and struggle to scale to complex environments.
We present a structure-induced exploration framework that integrates structural priors into the search process of the ant colony optimization (ACO)
arXiv Detail & Related papers (2025-12-25T12:53:24Z)
- Leveraging High-Fidelity Digital Models and Reinforcement Learning for Mission Engineering: A Case Study of Aerial Firefighting Under Perfect Information [1.0832844764942349]
Mission environments are uncertain, dynamic, and mission outcomes are a direct function of how the mission assets will interact with this environment.
This paper proposes an intelligent mission coordination methodology that integrates digital mission models with Reinforcement Learning (RL)
arXiv Detail & Related papers (2025-12-23T18:36:07Z)
- STRIDER: Navigation via Instruction-Aligned Structural Decision Space Optimization [73.98141357780032]
The VLN-CE task requires agents to navigate 3D environments using natural language instructions, without any scene-specific training.
Existing methods often fail to achieve robust navigation due to a lack of structured decision-making and insufficient integration of feedback from previous actions.
We propose STRIDER, a novel framework that systematically optimizes the agent's decision space by integrating spatial layout priors and dynamic task feedback.
Our approach introduces two key innovations: 1) a Structured Waypoint Generator that constrains the action space through spatial structure, and 2) a Task-Alignment Regulator that adjusts behavior based on task progress, ensuring semantic alignment throughout navigation.
arXiv Detail & Related papers (2025-10-27T04:37:21Z)
- Agile Tradespace Exploration for Space Rendezvous Mission Design via Transformers [22.891825351056823]
Spacecraft rendezvous enables on-orbit servicing and debris removal, forming the foundation for a scalable space economy.
This paper proposes a framework that can be used to design missions for a wide range of flight times.
The framework provides high-quality initial guesses that generalize to solutions in fewer iterations.
arXiv Detail & Related papers (2025-10-03T22:28:46Z)
- From reactive to cognitive: brain-inspired spatial intelligence for embodied agents [50.99942960312313]
Brain-inspired Spatial Cognition for Navigation (BSC-Nav) is a unified framework for constructing and leveraging structured spatial memory in embodied agents.
BSC-Nav builds allocentric cognitive maps from egocentric trajectories and contextual cues, and dynamically retrieves spatial knowledge aligned with semantic goals.
arXiv Detail & Related papers (2025-08-24T03:20:48Z)
- Integrating Symbolic RL Planning into a BDI-based Autonomous UAV Framework: System Integration and SIL Validation [3.5966087153300057]
We propose an extended version of the Autonomous Mission Agents for Drones (AMAD) cognitive multi-agent architecture, enhanced with symbolic reinforcement learning for dynamic mission planning and execution.
We validated our framework in a Software-in-the-Loop (SIL) environment structured identically to an intended Hardware-In-the-Loop Simulation (HILS) platform.
Experimental results demonstrate stable integration and interoperability of modules, successful transitions between BDI-driven and symbolic RL-driven planning phases, and consistent mission performance.
arXiv Detail & Related papers (2025-08-16T03:27:26Z)
- EmbodiedVSR: Dynamic Scene Graph-Guided Chain-of-Thought Reasoning for Visual Spatial Tasks [24.41705039390567]
EmbodiedVSR (Embodied Visual Spatial Reasoning) is a novel framework that integrates dynamic scene graph-guided Chain-of-Thought (CoT) reasoning.
Our method enables zero-shot spatial reasoning without task-specific fine-tuning.
Experiments demonstrate that our framework significantly outperforms existing MLLM-based methods in accuracy and reasoning coherence.
arXiv Detail & Related papers (2025-03-14T05:06:07Z)
- SpatialCoT: Advancing Spatial Reasoning through Coordinate Alignment and Chain-of-Thought for Embodied Task Planning [42.487500113839666]
We propose a novel approach to bolster the spatial reasoning capabilities of Vision-Language Models (VLMs)
Our approach comprises two stages: spatial coordinate bi-directional alignment, and chain-of-thought spatial grounding.
We evaluate our method on challenging navigation and manipulation tasks, both in simulation and real-world settings.
arXiv Detail & Related papers (2025-01-17T09:46:27Z)
- Self-reconfiguration Strategies for Space-distributed Spacecraft [17.70060501010008]
This paper proposes a distributed on-orbit spacecraft assembly algorithm, where future spacecraft can assemble modules with different functions on orbit.
Reasonable and efficient on-orbit self-reconfiguration algorithms play a crucial role in realizing the benefits of distributed spacecraft.
arXiv Detail & Related papers (2024-11-26T06:05:44Z)
- LLMSat: A Large Language Model-Based Goal-Oriented Agent for Autonomous Space Exploration [0.0]
This work explores the application of Large Language Models (LLMs) as the high-level control system of a spacecraft.
A series of deep space mission scenarios simulated within the popular game engine Kerbal Space Program are used as case studies to evaluate the implementation against the requirements.
arXiv Detail & Related papers (2024-04-13T03:33:17Z)
- Long-HOT: A Modular Hierarchical Approach for Long-Horizon Object
Transport [83.06265788137443]
We address key challenges in long-horizon embodied exploration and navigation by proposing a new object transport task and a novel modular framework for temporally extended navigation.
Our first contribution is the design of a novel Long-HOT environment focused on deep exploration and long-horizon planning.
We propose a modular hierarchical transport policy (HTP) that builds a topological graph of the scene to perform exploration with the help of weighted frontiers.
arXiv Detail & Related papers (2022-10-28T05:30:49Z)
- Decentralized Vehicle Coordination: The Berkeley DeepDrive Drone Dataset and Consensus-Based Models [76.32775745488073]
We present a novel dataset and modeling framework designed to study motion planning in understructured environments.
We demonstrate that a consensus-based modeling approach can effectively explain the emergence of priority orders observed in our dataset.
arXiv Detail & Related papers (2022-09-19T05:06:57Z)
- Overcoming Exploration: Deep Reinforcement Learning in Complex
Environments from Temporal Logic Specifications [2.8904578737516764]
We present a Deep Reinforcement Learning (DRL) algorithm for a task-guided robot with unknown continuous-time dynamics deployed in a large-scale complex environment.
Our framework is shown to significantly improve performance (effectiveness, efficiency) and exploration of robots tasked with complex missions in large-scale complex environments.
arXiv Detail & Related papers (2022-01-28T16:39:08Z)
- Successor Feature Landmarks for Long-Horizon Goal-Conditioned
Reinforcement Learning [54.378444600773875]
We introduce Successor Feature Landmarks (SFL), a framework for exploring large, high-dimensional environments.
SFL drives exploration by estimating state-novelty and enables high-level planning by abstracting the state-space as a non-parametric landmark-based graph.
We show in our experiments on MiniGrid and ViZDoom that SFL enables efficient exploration of large, high-dimensional state spaces.
arXiv Detail & Related papers (2021-11-18T18:36:05Z)
- Design Strategy Network: A deep hierarchical framework to represent
generative design strategies in complex action spaces [0.0]
This work introduces Design Strategy Network (DSN), a data-driven deep hierarchical framework that learns strategies over arbitrary complex action spaces.
The hierarchical architecture decomposes every action decision into first predicting a preferred spatial region in the design space.
Results show that DSNs significantly outperform non-hierarchical methods of policy representation.
arXiv Detail & Related papers (2021-10-07T19:29:40Z)
- Landmark Policy Optimization for Object Navigation Task [77.34726150561087]
This work studies the object goal navigation task, which involves navigating to the closest object related to the given semantic category in unseen environments.
Recent works have shown significant achievements both in the end-to-end Reinforcement Learning approach and modular systems, but need a big step forward to be robust and optimal.
We propose a hierarchical method that incorporates standard task formulation and additional area knowledge as landmarks, with a way to extract these landmarks.
arXiv Detail & Related papers (2021-09-17T12:28:46Z)
- Temporal Predictive Coding For Model-Based Planning In Latent Space [80.99554006174093]
We present an information-theoretic approach that employs temporal predictive coding to encode elements in the environment that can be predicted across time.
We evaluate our model on a challenging modification of standard DMControl tasks where the background is replaced with natural videos that contain complex but irrelevant information to the planning task.
arXiv Detail & Related papers (2021-06-14T04:31:15Z)
- Constrained optimisation of preliminary spacecraft configurations under
the design-for-demise paradigm [1.0205541448656992]
Most mid-sized satellites currently launched and already in orbit fail to comply with the casualty risk threshold of 0.0001.
Satellite manufacturers and mission operators need to perform a disposal through a controlled re-entry.
This additional cost and complexity can be removed as the spacecraft is directly compliant with the casualty risk regulations.
arXiv Detail & Related papers (2020-12-27T17:48:29Z)
- Reinforcement Learning for Low-Thrust Trajectory Design of
Interplanetary Missions [77.34726150561087]
This paper investigates the use of reinforcement learning for the robust design of interplanetary trajectories in presence of severe disturbances.
An open-source implementation of the state-of-the-art algorithm Proximal Policy Optimization is adopted.
The resulting Guidance and Control Network provides both a robust nominal trajectory and the associated closed-loop guidance law.
arXiv Detail & Related papers (2020-08-19T15:22:15Z)
- Jump Operator Planning: Goal-Conditioned Policy Ensembles and Zero-Shot
Transfer [71.44215606325005]
We propose a novel framework called Jump-Operator Dynamic Programming for quickly computing solutions within a super-exponential space of sequential sub-goal tasks.
This approach involves controlling over an ensemble of reusable goal-conditioned policies functioning as temporally extended actions.
We then identify classes of objective functions on this subspace whose solutions are invariant to the grounding, resulting in optimal zero-shot transfer.
arXiv Detail & Related papers (2020-07-06T05:13:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.