Optimal Multi-Debris Mission Planning in LEO: A Deep Reinforcement Learning Approach with Co-Elliptic Transfers and Refueling
- URL: http://arxiv.org/abs/2602.17685v1
- Date: Wed, 04 Feb 2026 22:15:14 GMT
- Title: Optimal Multi-Debris Mission Planning in LEO: A Deep Reinforcement Learning Approach with Co-Elliptic Transfers and Refueling
- Authors: Agni Bandyopadhyay, Gunther Waxenegger-Wilfing
- Abstract summary: This paper introduces a unified coelliptic maneuver framework that combines Hohmann transfers, safety ellipse proximity operations, and explicit refueling logic. We benchmark three distinct planning algorithms: Greedy, Monte Carlo Tree Search (MCTS), and deep reinforcement learning (RL). Experimental results over 100 test scenarios demonstrate that Masked PPO achieves superior mission efficiency and computational performance.
- Score: 22.261628532402067
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper addresses the challenge of multi-target active debris removal (ADR) in Low Earth Orbit (LEO) by introducing a unified coelliptic maneuver framework that combines Hohmann transfers, safety ellipse proximity operations, and explicit refueling logic. We benchmark three distinct planning algorithms: a Greedy heuristic, Monte Carlo Tree Search (MCTS), and deep reinforcement learning (RL) using Masked Proximal Policy Optimization (PPO), within a realistic orbital simulation environment featuring randomized debris fields, keep-out zones, and delta-V constraints. Experimental results over 100 test scenarios demonstrate that Masked PPO achieves superior mission efficiency and computational performance, visiting up to twice as many debris targets as Greedy and significantly outperforming MCTS in runtime. These findings underscore the promise of modern RL methods for scalable, safe, and resource-efficient space mission planning, paving the way for future advancements in ADR autonomy.
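The Hohmann transfers at the core of the paper's coelliptic maneuver framework have a closed-form delta-V cost between coplanar circular orbits. The sketch below is a generic textbook illustration of that cost (standard vis-viva relations), not the authors' code; the function name `hohmann_dv` and the example altitudes are illustrative assumptions.

```python
import math

MU = 3.986004418e14  # Earth's gravitational parameter, m^3/s^2


def hohmann_dv(r1, r2):
    """Total delta-V (m/s) of a two-impulse Hohmann transfer
    between coplanar circular orbits of radii r1 and r2 (meters)."""
    dv1 = abs(math.sqrt(MU / r1) * (math.sqrt(2 * r2 / (r1 + r2)) - 1))
    dv2 = abs(math.sqrt(MU / r2) * (1 - math.sqrt(2 * r1 / (r1 + r2))))
    return dv1 + dv2


# Example: raising a 400 km LEO to 800 km costs on the order of 0.2 km/s
R_EARTH = 6378137.0
cost = hohmann_dv(R_EARTH + 400e3, R_EARTH + 800e3)
```

For a lowering transfer (r2 < r1) the same expressions apply with the impulse directions reversed; the `abs()` keeps the budgeted cost positive either way, which is what matters for a delta-V constraint.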
Related papers
- Plan-MCTS: Plan Exploration for Action Exploitation in Web Navigation [50.406803870992974]
Plan-MCTS is a framework that reformulates web navigation by shifting exploration to a semantic Plan Space. Plan-MCTS achieves state-of-the-art performance, surpassing current approaches with higher task effectiveness and search efficiency.
arXiv Detail & Related papers (2026-02-15T10:24:45Z) - Evaluating Robustness and Adaptability in Learning-Based Mission Planning for Active Debris Removal [22.261628532402067]
This work compares three planners for the constrained multi-debris rendezvous problem in Low Earth Orbit. Evaluations are conducted in a high-fidelity orbital simulation with refueling, realistic transfer dynamics, and randomized debris fields.
arXiv Detail & Related papers (2026-02-04T22:22:40Z) - Optimizing Mission Planning for Multi-Debris Rendezvous Using Reinforcement Learning with Refueling and Adaptive Collision Avoidance [22.261628532402067]
This study presents a reinforcement learning based framework to enhance adaptive collision avoidance in active debris removal missions. Small satellites are increasingly adopted due to their flexibility, cost effectiveness, and maneuverability, making them well suited for dynamic missions such as ADR. The framework integrates refueling strategies, efficient mission planning, and adaptive collision avoidance to optimize spacecraft rendezvous operations.
arXiv Detail & Related papers (2026-02-04T21:49:20Z) - TLE-Based A2C Agent for Terrestrial Coverage Orbital Path Planning [0.0]
The congestion of Low Earth Orbit (LEO) poses persistent challenges to the efficient deployment and safe operation of Earth observation satellites. This work presents a reinforcement learning framework using the Advantage Actor-Critic (A2C) algorithm to optimize satellite orbital parameters for precise terrestrial coverage.
arXiv Detail & Related papers (2025-08-14T17:44:51Z) - LLM Meets the Sky: Heuristic Multi-Agent Reinforcement Learning for Secure Heterogeneous UAV Networks [57.27815890269697]
This work focuses on maximizing the secrecy rate in heterogeneous UAV networks (HetUAVNs) under energy constraints. We introduce a Large Language Model (LLM)-guided multi-agent learning approach. Results show that our method outperforms existing baselines in secrecy and energy efficiency.
arXiv Detail & Related papers (2025-07-23T04:22:57Z) - AI-Driven Risk-Aware Scheduling for Active Debris Removal Missions [0.0]
Debris in Low Earth Orbit represents a significant threat to space sustainability and spacecraft safety.
Orbital Transfer Vehicles (OTVs) facilitate debris deorbiting, thereby reducing future collision risks.
A decision-planning model based on Deep Reinforcement Learning (DRL) is developed to train an OTV to plan optimal debris removal sequencing.
It is shown that using the proposed framework, the agent can find optimal mission plans and learn to update the planning autonomously to include risk handling of debris with high collision risk.
arXiv Detail & Related papers (2024-09-25T15:16:07Z) - Revisiting Space Mission Planning: A Reinforcement Learning-Guided Approach for Multi-Debris Rendezvous [15.699822139827916]
The aim is to optimize the sequence in which all the given debris should be visited to get the least total time for rendezvous for the entire mission.
A neural network (NN) policy is developed, trained on simulated space missions with varying debris fields.
The reinforcement learning approach demonstrates a significant improvement in planning efficiency.
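As a point of comparison for the learned sequencing policies described in this line of work, a minimal greedy baseline can be sketched: repeatedly visit the cheapest remaining target until the delta-V budget is spent. This is an illustrative nearest-neighbor sketch under the assumption of coplanar circular orbits and Hohmann-transfer costs, not any paper's actual planner; all names (`hohmann_dv`, `greedy_sequence`) are hypothetical.

```python
import math

MU = 3.986004418e14  # Earth's gravitational parameter, m^3/s^2


def hohmann_dv(r1, r2):
    """Total delta-V (m/s) of a two-impulse Hohmann transfer
    between coplanar circular orbits of radii r1 and r2 (meters)."""
    dv1 = abs(math.sqrt(MU / r1) * (math.sqrt(2 * r2 / (r1 + r2)) - 1))
    dv2 = abs(math.sqrt(MU / r2) * (1 - math.sqrt(2 * r1 / (r1 + r2))))
    return dv1 + dv2


def greedy_sequence(start_radius, debris_radii, dv_budget):
    """Nearest-neighbor baseline: visit the cheapest remaining debris
    target while the cumulative delta-V stays within dv_budget (m/s)."""
    current = start_radius
    remaining = list(debris_radii)
    order, spent = [], 0.0
    while remaining:
        best = min(remaining, key=lambda r: hohmann_dv(current, r))
        cost = hohmann_dv(current, best)
        if spent + cost > dv_budget:
            break  # budget exhausted; stop the mission here
        spent += cost
        order.append(best)
        remaining.remove(best)
        current = best
    return order, spent
```

Such a baseline is myopic: it can strand the vehicle far from cheap clusters of targets, which is exactly the failure mode that search- and RL-based planners aim to avoid.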
arXiv Detail & Related papers (2024-09-25T12:50:01Z) - Trial and Error: Exploration-Based Trajectory Optimization for LLM Agents [49.85633804913796]
We present an exploration-based trajectory optimization approach, referred to as ETO.
This learning method is designed to enhance the performance of open LLM agents.
Our experiments on three complex tasks demonstrate that ETO consistently surpasses baseline performance by a large margin.
arXiv Detail & Related papers (2024-03-04T21:50:29Z) - Reparameterized Policy Learning for Multimodal Trajectory Optimization [61.13228961771765]
We investigate the challenge of parametrizing policies for reinforcement learning in high-dimensional continuous action spaces.
We propose a principled framework that models the continuous RL policy as a generative model of optimal trajectories.
We present a practical model-based RL method, which leverages the multimodal policy parameterization and learned world model.
arXiv Detail & Related papers (2023-07-20T09:05:46Z) - Learning Space Partitions for Path Planning [54.475949279050596]
PlaLaM outperforms existing path planning methods in 2D navigation tasks, especially in the presence of difficult-to-escape local optima.
These gains transfer to highly multimodal real-world tasks, where we outperform strong baselines in compiler phase ordering by up to 245% and in molecular design by up to 0.4 on properties on a 0-1 scale.
arXiv Detail & Related papers (2021-06-19T18:06:11Z) - Reinforcement Learning for Low-Thrust Trajectory Design of Interplanetary Missions [77.34726150561087]
This paper investigates the use of reinforcement learning for the robust design of interplanetary trajectories in presence of severe disturbances.
An open-source implementation of the state-of-the-art algorithm Proximal Policy Optimization is adopted.
The resulting Guidance and Control Network provides both a robust nominal trajectory and the associated closed-loop guidance law.
arXiv Detail & Related papers (2020-08-19T15:22:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.