Revisiting Space Mission Planning: A Reinforcement Learning-Guided Approach for Multi-Debris Rendezvous
- URL: http://arxiv.org/abs/2409.16882v1
- Date: Wed, 25 Sep 2024 12:50:01 GMT
- Title: Revisiting Space Mission Planning: A Reinforcement Learning-Guided Approach for Multi-Debris Rendezvous
- Authors: Agni Bandyopadhyay, Guenther Waxenegger-Wilfing
- Abstract summary: The aim is to optimize the order in which the given debris objects are visited so as to minimize the total rendezvous time for the entire mission.
A neural network (NN) policy is developed, trained on simulated space missions with varying debris fields.
The reinforcement learning approach demonstrates a significant improvement in planning efficiency.
- Score: 15.699822139827916
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This research introduces a novel application of a masked Proximal Policy Optimization (PPO) algorithm from the field of deep reinforcement learning (RL) for determining the most efficient sequence of space debris visitation, using the Lambert solver as per Izzo's adaptation for individual rendezvous. The aim is to optimize the order in which the given debris objects are visited so as to minimize the total rendezvous time for the entire mission. A neural network (NN) policy is developed and trained on simulated space missions with varying debris fields. After training, the neural network computes approximately optimal paths using Izzo's adaptation of Lambert maneuvers. Performance is evaluated against standard heuristics in mission planning. The reinforcement learning approach demonstrates a significant improvement in planning efficiency by optimizing the debris rendezvous sequence, reducing total mission time by an average of approximately 10.96% and 13.66% compared to the Genetic and Greedy algorithms, respectively. Across the simulated scenarios, the model on average identifies the most time-efficient visitation sequence while also being the fastest method computationally. This approach marks a step forward in mission planning strategies for space debris clearance.
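To make the sequencing idea concrete, here is a minimal, self-contained sketch (not the authors' code) of how a policy with an action mask could pick the next debris target: already-visited objects are masked out, and each candidate leg is scored by its transfer time. The `transfer_time` function is a toy placeholder; in the paper each leg is evaluated with Izzo's adaptation of the Lambert solver (available, for example, as `pykep.lambert_problem`), and the scoring comes from the trained masked-PPO policy rather than the hand-written heuristic used here.
```python
import numpy as np

rng = np.random.default_rng(0)
N_DEBRIS = 8
positions = rng.uniform(-1.0, 1.0, size=(N_DEBRIS, 3))  # toy debris "states"

def transfer_time(r_from, r_to):
    # Placeholder leg cost; in the paper this is the time of flight of a
    # Lambert transfer (Izzo's formulation, e.g. via pykep.lambert_problem).
    return float(np.linalg.norm(r_to - r_from))

def policy_scores(current_idx):
    # Stand-in for the trained NN policy: score every candidate next debris.
    return np.array([-transfer_time(positions[current_idx], positions[j])
                     for j in range(N_DEBRIS)])

def masked_argmax(scores, visited):
    # Action masking: already-visited debris get -inf and are never chosen.
    return int(np.argmax(np.where(visited, -np.inf, scores)))

visited = np.zeros(N_DEBRIS, dtype=bool)
current = 0
visited[current] = True
sequence, total_time = [current], 0.0
for _ in range(N_DEBRIS - 1):
    nxt = masked_argmax(policy_scores(current), visited)
    total_time += transfer_time(positions[current], positions[nxt])
    visited[nxt] = True
    sequence.append(nxt)
    current = nxt

print("visit order:", sequence, "total transfer time:", round(total_time, 3))
```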
Related papers
- Truncating Trajectories in Monte Carlo Policy Evaluation: an Adaptive Approach [51.76826149868971]
Policy evaluation via Monte Carlo simulation is at the core of many MC Reinforcement Learning (RL) algorithms.
We propose as a quality index a surrogate of the mean squared error of a return estimator that uses trajectories of different lengths.
We present an adaptive algorithm called Robust and Iterative Data collection strategy Optimization (RIDO).
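As a rough illustration of the trade-off RIDO addresses (this is not the paper's algorithm), the toy sketch below estimates a policy's return from Monte Carlo rollouts truncated at different horizons under a fixed interaction budget: shorter rollouts cut off the discounted tail of the return but allow many more rollouts, so bias and variance of the estimator move in opposite directions. All names and numbers are illustrative.
```python
import numpy as np

rng = np.random.default_rng(0)
GAMMA = 0.95

def rollout_rewards(horizon):
    # Stand-in for one environment rollout under the evaluated policy.
    return rng.normal(1.0, 0.5, size=horizon)

def truncated_return(horizon):
    rewards = rollout_rewards(horizon)
    return float(np.sum(GAMMA ** np.arange(horizon) * rewards))

budget = 10_000  # total environment steps available
for horizon in (10, 50, 200):
    n_rollouts = budget // horizon
    estimates = [truncated_return(horizon) for _ in range(n_rollouts)]
    print(f"H={horizon:4d}  rollouts={n_rollouts:4d}  "
          f"estimate={np.mean(estimates):6.3f} "
          f"+/- {np.std(estimates) / np.sqrt(n_rollouts):.3f}")
```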
arXiv Detail & Related papers (2024-10-17T11:47:56Z)
- Maximum-Likelihood Inverse Reinforcement Learning with Finite-Time Guarantees [56.848265937921354]
Inverse reinforcement learning (IRL) aims to recover the reward function and the associated optimal policy.
Many algorithms for IRL have an inherently nested structure.
We develop a novel single-loop algorithm for IRL that does not compromise reward estimation accuracy.
arXiv Detail & Related papers (2022-10-04T17:13:45Z)
- Large region targets observation scheduling by multiple satellites using resampling particle swarm optimization [0.3324876873771104]
The last decades have witnessed a rapid increase in the number of Earth observation satellites (EOSs).
This paper aims to address the EOSs observation scheduling problem for large region targets.
arXiv Detail & Related papers (2022-06-21T08:18:02Z)
- GEO satellites on-orbit repairing mission planning with mission deadline constraint using a large neighborhood search-genetic algorithm [2.106508530625051]
This paper proposes a large neighborhood search-adaptive genetic algorithm (LNS-AGA) for many-to-many on-orbit repairing mission planning.
In the many-to-many on-orbit repairing scenario, several servicing spacecraft and target satellites are located in GEO orbits with different inclinations, RAANs, and true anomalies.
The mission objective is to find the optimal servicing sequence and orbit rendezvous times for every servicing spacecraft so as to minimize the total cost across all servicing spacecraft while repairing all target satellites.
arXiv Detail & Related papers (2021-10-08T03:33:37Z)
- Learning Space Partitions for Path Planning [54.475949279050596]
PlaLaM outperforms existing path planning methods in 2D navigation tasks, especially in the presence of difficult-to-escape local optima.
These gains transfer to highly multimodal real-world tasks, where we outperform strong baselines in compiler phase ordering by up to 245% and in molecular design by up to 0.4 on properties on a 0-1 scale.
arXiv Detail & Related papers (2021-06-19T18:06:11Z)
- Safety-enhanced UAV Path Planning with Spherical Vector-based Particle Swarm Optimization [5.076419064097734]
This paper presents a new algorithm named spherical vector-based particle swarm optimization (SPSO) to deal with the problem of path planning for unmanned aerial vehicles (UAVs).
A cost function is first formulated to convert the path planning into an optimization problem that incorporates requirements and constraints for the feasible and safe operation of the UAV.
SPSO is then used to find the optimal path that minimizes the cost function by efficiently searching the configuration space of the UAV.
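The summary above follows the usual "cost function plus swarm search" recipe. The sketch below is a generic particle swarm optimizer over a toy 2-D waypoint path with an obstacle penalty, not the spherical vector-based variant from the paper, and all constants are illustrative.
```python
import numpy as np

rng = np.random.default_rng(0)
START, GOAL = np.array([0.0, 0.0]), np.array([10.0, 10.0])
OBSTACLE, RADIUS = np.array([5.0, 5.0]), 2.0
N_WAYPOINTS, N_PARTICLES, ITERS = 3, 30, 200

def path_cost(flat):
    # Path length plus a penalty for waypoints inside the obstacle region.
    pts = np.vstack([START, flat.reshape(N_WAYPOINTS, 2), GOAL])
    length = np.sum(np.linalg.norm(np.diff(pts, axis=0), axis=1))
    dists = np.linalg.norm(pts - OBSTACLE, axis=1)
    penalty = np.sum(np.maximum(0.0, RADIUS - dists)) * 50.0
    return length + penalty

dim = N_WAYPOINTS * 2
pos = rng.uniform(0.0, 10.0, size=(N_PARTICLES, dim))
vel = np.zeros_like(pos)
pbest = pos.copy()
pbest_cost = np.array([path_cost(p) for p in pos])
gbest = pbest[np.argmin(pbest_cost)].copy()

for _ in range(ITERS):
    r1, r2 = rng.random((N_PARTICLES, dim)), rng.random((N_PARTICLES, dim))
    # Standard velocity update: inertia + pull toward personal and global bests.
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = pos + vel
    costs = np.array([path_cost(p) for p in pos])
    improved = costs < pbest_cost
    pbest[improved], pbest_cost[improved] = pos[improved], costs[improved]
    gbest = pbest[np.argmin(pbest_cost)].copy()

print("best path cost:", round(path_cost(gbest), 3))
```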
arXiv Detail & Related papers (2021-04-13T06:45:11Z)
- Autonomous Drone Racing with Deep Reinforcement Learning [39.757652701917166]
In many robotic tasks, such as drone racing, the goal is to travel through a set of waypoints as fast as possible.
A key challenge is planning the minimum-time trajectory, which is typically solved by assuming perfect knowledge of the waypoints to pass in advance.
In this work, a new approach to minimum-time trajectory generation for quadrotors is presented.
arXiv Detail & Related papers (2021-03-15T18:05:49Z)
- Reinforcement Learning for Low-Thrust Trajectory Design of Interplanetary Missions [77.34726150561087]
This paper investigates the use of reinforcement learning for the robust design of interplanetary trajectories in the presence of severe disturbances.
An open-source implementation of the state-of-the-art algorithm Proximal Policy Optimization is adopted.
The resulting Guidance and Control Network provides both a robust nominal trajectory and the associated closed-loop guidance law.
arXiv Detail & Related papers (2020-08-19T15:22:15Z)
- MLE-guided parameter search for task loss minimization in neural sequence modeling [83.83249536279239]
Neural autoregressive sequence models are used to generate sequences in a variety of natural language processing (NLP) tasks.
We propose maximum likelihood guided parameter search (MGS), which samples from a distribution over update directions that is a mixture of random search around the current parameters and around the maximum likelihood gradient.
Our experiments show that MGS is capable of optimizing sequence-level losses, with substantial reductions in repetition and non-termination in sequence completion, and similar improvements to those of minimum risk training in machine translation.
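The mixture-sampling idea can be illustrated with a toy sketch (not the paper's MGS implementation): candidate parameter updates are drawn either around the current parameters or around a maximum-likelihood gradient step, and a simplified greedy rule keeps whichever candidate has the lowest task loss, standing in for the paper's actual update rule. Both `mle_gradient` and `task_loss` are placeholders.
```python
import numpy as np

rng = np.random.default_rng(0)

def mle_gradient(theta):
    # Stand-in for the maximum-likelihood gradient of a sequence model.
    return -theta  # toy: pulls parameters toward zero

def task_loss(theta):
    # Stand-in for a sequence-level task loss (e.g. repetition, BLEU-based).
    return float(np.sum((theta - 1.0) ** 2))

def mgs_step(theta, n_candidates=8, step=0.1, sigma=0.05, p_grad=0.5):
    candidates = [theta]
    for _ in range(n_candidates):
        noise = rng.normal(0.0, sigma, size=theta.shape)
        if rng.random() < p_grad:
            # Sample around the MLE gradient update direction.
            cand = theta + step * mle_gradient(theta) + noise
        else:
            # Plain random search around the current parameters.
            cand = theta + noise
        candidates.append(cand)
    # Simplified selection: keep the candidate with the lowest task loss.
    return min(candidates, key=task_loss)

theta = rng.normal(size=4)
for _ in range(50):
    theta = mgs_step(theta)
print("final task loss:", round(task_loss(theta), 4))
```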
arXiv Detail & Related papers (2020-06-04T22:21:22Z)
- Congestion-aware Evacuation Routing using Augmented Reality Devices [96.68280427555808]
We present a congestion-aware routing solution for indoor evacuation, which produces real-time individual-customized evacuation routes among multiple destinations.
A population density map, obtained on-the-fly by aggregating locations of evacuees from user-end Augmented Reality (AR) devices, is used to model the congestion distribution inside a building.
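A minimal sketch of the routing side of this idea (not the paper's system): a shortest-path search over an indoor graph whose edge costs are inflated by a node-level density estimate, so crowded corridors become expensive. The graph, densities, and penalty factor are all made up for illustration.
```python
import heapq

# Toy corridor graph: node -> [(neighbor, base_cost)].
graph = {
    "room": [("hall_a", 1.0), ("hall_b", 1.0)],
    "hall_a": [("exit_1", 2.0)],
    "hall_b": [("exit_2", 2.0)],
    "exit_1": [],
    "exit_2": [],
}
# Density per node, e.g. aggregated from AR-device locations (0 = empty).
density = {"room": 0.1, "hall_a": 0.9, "hall_b": 0.2, "exit_1": 0.3, "exit_2": 0.1}

def congested_cost(base, node):
    # Inflate the traversal cost of crowded areas.
    return base * (1.0 + 4.0 * density[node])

def shortest_path(src, targets):
    # Dijkstra over congestion-inflated costs to the nearest exit.
    dist, prev = {src: 0.0}, {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in targets:
            path = [u]
            while u in prev:
                u = prev[u]
                path.append(u)
            return list(reversed(path)), d
        if d > dist.get(u, float("inf")):
            continue
        for v, base in graph[u]:
            nd = d + congested_cost(base, v)
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    return None, float("inf")

print(shortest_path("room", {"exit_1", "exit_2"}))
```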
arXiv Detail & Related papers (2020-04-25T22:54:35Z)