Reinforcement Learning for Low-Thrust Trajectory Design of
Interplanetary Missions
- URL: http://arxiv.org/abs/2008.08501v1
- Date: Wed, 19 Aug 2020 15:22:15 GMT
- Title: Reinforcement Learning for Low-Thrust Trajectory Design of
Interplanetary Missions
- Authors: Alessandro Zavoli and Lorenzo Federici
- Abstract summary: This paper investigates the use of reinforcement learning for the robust design of interplanetary trajectories in the presence of severe disturbances.
An open-source implementation of the state-of-the-art Proximal Policy Optimization algorithm is adopted.
The resulting Guidance and Control Network provides both a robust nominal trajectory and the associated closed-loop guidance law.
- Score: 77.34726150561087
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper investigates the use of Reinforcement Learning for the robust
design of low-thrust interplanetary trajectories in the presence of severe
disturbances, modeled alternatively as Gaussian additive process noise,
observation noise, control actuation errors on thrust magnitude and direction,
and possibly multiple missed thrust events. The optimal control problem is
recast as a time-discrete Markov Decision Process to comply with the standard
formulation of reinforcement learning. An open-source implementation of the
state-of-the-art Proximal Policy Optimization algorithm is adopted to carry out
the training process of a deep neural network, used to map the spacecraft
(observed) states to the optimal control policy. The resulting Guidance and
Control Network provides both a robust nominal trajectory and the associated
closed-loop guidance law. Numerical results are presented for a typical
Earth-Mars mission. First, in order to validate the proposed approach, the
solution found in a (deterministic) unperturbed scenario is compared with the
optimal one provided by an indirect technique. Then, the robustness and
optimality of the obtained closed-loop guidance laws are assessed by means of
Monte Carlo campaigns performed in the considered uncertain scenarios. These
preliminary results open up new horizons for the use of reinforcement learning
in the robust design of interplanetary missions.
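As a rough illustration of the time-discrete MDP formulation described in the abstract, the toy sketch below shows where each stated disturbance source (Gaussian additive process noise, observation noise, actuation errors on thrust magnitude and direction, and missed thrust events) can enter a step function. This is an assumed interpretation with planar dynamics and placeholder constants, not the paper's actual Earth-Mars model.

```python
import numpy as np

class LowThrustTransferMDP:
    """Toy time-discrete MDP for a low-thrust transfer under uncertainty.

    State: [x, y, vx, vy, m] in nondimensional units (illustrative only).
    Action: (throttle in [0, 1], thrust angle in rad).
    """

    def __init__(self, n_steps=40, sigma_proc=1e-3, sigma_obs=1e-3,
                 sigma_u=0.02, p_mte=0.05, seed=0):
        self.n_steps = n_steps
        self.sigma_proc = sigma_proc  # Gaussian additive process noise
        self.sigma_obs = sigma_obs    # observation noise
        self.sigma_u = sigma_u        # actuation error (magnitude and direction)
        self.p_mte = p_mte            # probability of a missed thrust event
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        self.k = 0
        self.state = np.array([1.0, 0.0, 0.0, 1.0, 1.0])
        return self.observe()

    def observe(self):
        # The guidance network sees a noisy observation, not the true state.
        return self.state + self.rng.normal(0.0, self.sigma_obs, size=5)

    def step(self, action):
        throttle, angle = action
        # Control actuation errors on thrust magnitude and direction.
        throttle = float(np.clip(
            throttle * (1.0 + self.rng.normal(0.0, self.sigma_u)), 0.0, 1.0))
        angle = angle + self.rng.normal(0.0, self.sigma_u)
        # Missed thrust event: the engine does not fire this step.
        if self.rng.random() < self.p_mte:
            throttle = 0.0
        T = 0.05 * throttle  # placeholder max-thrust scaling
        x, y, vx, vy, m = self.state
        r3 = (x * x + y * y) ** 1.5
        ax = -x / r3 + T * np.cos(angle) / m
        ay = -y / r3 + T * np.sin(angle) / m
        dt = 0.1
        self.state = np.array([x + vx * dt, y + vy * dt,
                               vx + ax * dt, vy + ay * dt,
                               m - 0.01 * throttle * dt])
        # Gaussian additive process noise on position and velocity.
        self.state[:4] += self.rng.normal(0.0, self.sigma_proc, size=4)
        self.k += 1
        reward = -0.01 * throttle * dt  # fuel penalty; terminal reward omitted
        done = self.k >= self.n_steps
        return self.observe(), reward, done
```

In this framing, a policy network trained with PPO would map `observe()` outputs to `(throttle, angle)` actions, and the Monte Carlo campaigns mentioned in the abstract amount to rolling out many independently seeded episodes of such an environment.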
Related papers
- Truncating Trajectories in Monte Carlo Policy Evaluation: an Adaptive Approach [51.76826149868971]
Policy evaluation via Monte Carlo simulation is at the core of many MC Reinforcement Learning (RL) algorithms.
As a quality index, we propose a surrogate of the mean squared error of a return estimator that uses trajectories of different lengths.
We present an adaptive algorithm called Robust and Iterative Data collection strategy Optimization (RIDO)
arXiv Detail & Related papers (2024-10-17T11:47:56Z)
- Revisiting Space Mission Planning: A Reinforcement Learning-Guided Approach for Multi-Debris Rendezvous [15.699822139827916]
The aim is to optimize the sequence in which all the given debris should be visited so as to minimize the total rendezvous time for the entire mission.
A neural network (NN) policy is developed, trained on simulated space missions with varying debris fields.
The reinforcement learning approach demonstrates a significant improvement in planning efficiency.
arXiv Detail & Related papers (2024-09-25T12:50:01Z)
- Learning Optimal Deterministic Policies with Stochastic Policy Gradients [62.81324245896716]
Policy gradient (PG) methods are successful approaches to deal with continuous reinforcement learning (RL) problems.
In common practice, stochastic (hyper)policies are learned only to deploy their deterministic version.
We show how to tune the exploration level used for learning to optimize the trade-off between the sample complexity and the performance of the deployed deterministic policy.
arXiv Detail & Related papers (2024-05-03T16:45:15Z)
- Adaptive trajectory-constrained exploration strategy for deep reinforcement learning [6.589742080994319]
Deep reinforcement learning (DRL) faces significant challenges in addressing the hard-exploration problems in tasks with sparse or deceptive rewards and large state spaces.
We propose an efficient adaptive trajectory-constrained exploration strategy for DRL.
We conduct experiments on two large 2D grid world mazes and several MuJoCo tasks.
arXiv Detail & Related papers (2023-12-27T07:57:15Z)
- Deep Bayesian Reinforcement Learning for Spacecraft Proximity Maneuvers and Docking [4.9653656404010205]
We introduce a novel Bayesian actor-critic reinforcement learning algorithm to learn a control policy with the stability guarantee.
The proposed algorithm has been experimentally evaluated on a spacecraft air-bearing testbed, showing promising performance.
arXiv Detail & Related papers (2023-11-07T03:12:58Z)
- Low-Thrust Orbital Transfer using Dynamics-Agnostic Reinforcement Learning [0.0]
This study uses model-free Reinforcement Learning to train an agent on a constrained pericenter raising scenario for a low-thrust medium-Earth-orbit satellite.
The trained agent is then used to design a trajectory and to autonomously control the satellite during the cruise.
arXiv Detail & Related papers (2022-10-06T08:36:35Z)
- Large-Scale Sequential Learning for Recommender and Engineering Systems [91.3755431537592]
In this thesis, we focus on the design of automatic algorithms that provide personalized ranking by adapting to the current conditions.
For the former, we propose a novel algorithm called SAROS that takes into account both kinds of feedback for learning over the sequence of interactions.
The proposed idea of taking into account the neighbouring lines shows statistically significant results in comparison with the initial approach for fault detection in power grids.
arXiv Detail & Related papers (2022-05-13T21:09:41Z)
- Motion Planning for Autonomous Vehicles in the Presence of Uncertainty Using Reinforcement Learning [0.0]
Motion planning under uncertainty is one of the main challenges in developing autonomous driving vehicles.
We propose a reinforcement learning based solution to manage uncertainty by optimizing for the worst case outcome.
The proposed approach yields much better motion planning behavior than conventional RL algorithms and behaves comparably to human driving styles.
arXiv Detail & Related papers (2021-10-01T20:32:25Z)
- Chance-Constrained Trajectory Optimization for Safe Exploration and Learning of Nonlinear Systems [81.7983463275447]
Learning-based control algorithms require data collection with abundant supervision for training.
We present a new approach for optimal motion planning with safe exploration that integrates chance-constrained optimal control with dynamics learning and feedback control.
arXiv Detail & Related papers (2020-05-09T05:57:43Z)
- Localized active learning of Gaussian process state space models [63.97366815968177]
A globally accurate model is not required to achieve good performance in many common control applications.
We propose an active learning strategy for Gaussian process state space models that aims to obtain an accurate model on a bounded subset of the state-action space.
By employing model predictive control, the proposed technique integrates information collected during exploration and adaptively improves its exploration strategy.
arXiv Detail & Related papers (2020-05-04T05:35:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.