Reinforcement Meta-Learning for Interception of Maneuvering
Exoatmospheric Targets with Parasitic Attitude Loop
- URL: http://arxiv.org/abs/2004.09978v1
- Date: Sat, 18 Apr 2020 21:20:59 GMT
- Title: Reinforcement Meta-Learning for Interception of Maneuvering
Exoatmospheric Targets with Parasitic Attitude Loop
- Authors: Brian Gaudet, Roberto Furfaro, Richard Linares, Andrea Scorsoglio
- Abstract summary: We use Reinforcement Meta-Learning to optimize an adaptive integrated guidance, navigation, and control system suitable for exoatmospheric interception of a maneuvering target.
The system maps observations consisting of strapdown seeker angles and rate gyro measurements directly to thruster on-off commands.
We demonstrate that the optimized policy can adapt to parasitic effects including seeker angle measurement lag, thruster control lag, the parasitic attitude loop resulting from scale factor errors and Gaussian noise on angle and rotational velocity measurements.
- Score: 1.7663909228482466
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We use Reinforcement Meta-Learning to optimize an adaptive integrated
guidance, navigation, and control system suitable for exoatmospheric
interception of a maneuvering target. The system maps observations consisting
of strapdown seeker angles and rate gyro measurements directly to thruster
on-off commands. Using a high fidelity six degree-of-freedom simulator, we
demonstrate that the optimized policy can adapt to parasitic effects including
seeker angle measurement lag, thruster control lag, the parasitic attitude loop
resulting from scale factor errors and Gaussian noise on angle and rotational
velocity measurements, and a time varying center of mass caused by fuel
consumption and slosh. Importantly, the optimized policy gives good performance
over a wide range of challenging target maneuvers. Unlike previous work that
enhances range observability by inducing line of sight oscillations, our system
is optimized to use only measurements available from the seeker and rate gyros.
Through extensive Monte Carlo simulation of randomized exoatmospheric
interception scenarios, we demonstrate that the optimized policy gives
performance close to that of augmented proportional navigation with perfect
knowledge of the full engagement state. The optimized system is computationally
efficient and requires minimal memory, and should be compatible with today's
flight processors.
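As context for the comparison above, augmented proportional navigation (APN) with full engagement-state knowledge is the classical baseline: it commands acceleration to null the predicted zero-effort miss, with an extra term that compensates for target acceleration. Below is a minimal sketch of that baseline, assuming a zero-effort-miss formulation; the function name, signature, and gain value are illustrative and not taken from the paper. The paper's learned policy, by contrast, maps raw seeker angles and rate-gyro measurements directly to thruster on-off commands without reconstructing this state.

```python
import numpy as np

def augmented_pn_accel(r_rel, v_rel, a_target, N=3.0):
    """Augmented proportional navigation via a zero-effort-miss (ZEM) form.

    r_rel, v_rel : target position and velocity relative to the interceptor (m, m/s)
    a_target     : estimated target acceleration (m/s^2)
    N            : navigation gain (typically 3-5)

    Returns the commanded acceleration perpendicular to the line of sight.
    Requires perfect knowledge of the full engagement state, which is the
    idealized condition the learned policy is benchmarked against.
    """
    t_go = -np.dot(r_rel, r_rel) / np.dot(r_rel, v_rel)     # time to go while closing
    zem = r_rel + v_rel * t_go + 0.5 * a_target * t_go**2   # predicted miss with no further control
    los = r_rel / np.linalg.norm(r_rel)
    zem_perp = zem - np.dot(zem, los) * los                 # keep only the cross-LOS component
    return N * zem_perp / t_go**2
```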
Related papers
- MPVO: Motion-Prior based Visual Odometry for PointGoal Navigation [3.9974562667271507]
Visual odometry (VO) is essential for enabling accurate point-goal navigation of embodied agents in indoor environments.
Recent deep-learned VO methods show robust performance but suffer from sample inefficiency during training.
We propose a robust and sample-efficient VO pipeline based on motion priors available while an agent is navigating an environment.
arXiv Detail & Related papers (2024-11-07T15:36:49Z)
- Truncating Trajectories in Monte Carlo Policy Evaluation: an Adaptive Approach [51.76826149868971]
Policy evaluation via Monte Carlo simulation is at the core of many MC Reinforcement Learning (RL) algorithms.
We propose, as a quality index, a surrogate of the mean squared error of a return estimator that uses trajectories of different lengths (a minimal sketch of truncated-return estimation appears after this list).
We present an adaptive algorithm called Robust and Iterative Data collection strategy Optimization (RIDO).
arXiv Detail & Related papers (2024-10-17T11:47:56Z)
- Angle Robustness Unmanned Aerial Vehicle Navigation in GNSS-Denied Scenarios [66.05091704671503]
We present a novel angle navigation paradigm to deal with flight deviation in point-to-point navigation tasks.
We also propose a model that includes the Adaptive Feature Enhance Module, Cross-knowledge Attention-guided Module and Robust Task-oriented Head Module.
arXiv Detail & Related papers (2024-02-04T08:41:20Z)
- Globally Optimal Event-Based Divergence Estimation for Ventral Landing [55.29096494880328]
Event sensing is a major component in bio-inspired flight guidance and control systems.
We explore the usage of event cameras for predicting time-to-contact with the surface during ventral landing.
This is achieved by estimating divergence (inverse TTC), which is the rate of radial optic flow, from the event stream generated during landing.
Our core contributions are a novel contrast maximisation formulation for event-based divergence estimation, and a branch-and-bound algorithm to exactly maximise contrast and find the optimal divergence value.
arXiv Detail & Related papers (2022-09-27T06:00:52Z)
- Adaptive Approach Phase Guidance for a Hypersonic Glider via Reinforcement Meta Learning [0.0]
Adaptability is achieved by optimizing over a range of off-nominal flight conditions.
The system maps observations directly to commanded bank angle and angle-of-attack rates.
Minimizing the tracking error keeps the curved-space line of sight to the target location aligned with the vehicle's velocity vector.
arXiv Detail & Related papers (2021-07-30T17:14:52Z)
- Optimal control of point-to-point navigation in turbulent time-dependent flows using Reinforcement Learning [0.0]
We present theoretical and numerical results on the problem of finding the path that minimizes the time to navigate between two given points in a complex fluid.
We show that Actor-Critic algorithms are able to find quasi-optimal solutions in the presence of either time-independent or chaotically evolving flow configurations.
arXiv Detail & Related papers (2021-02-27T21:31:18Z)
- Safe and Efficient Model-free Adaptive Control via Bayesian Optimization [39.962395119933596]
We propose a purely data-driven, model-free approach for adaptive control.
However, tuning low-level controllers based solely on system data raises concerns about the safety and computational performance of the underlying algorithm.
We numerically demonstrate for several types of disturbances that our approach is sample efficient, outperforms constrained Bayesian optimization in terms of safety, and achieves the performance optima computed by grid evaluation.
arXiv Detail & Related papers (2021-01-19T19:15:00Z)
- Pushing the Envelope of Rotation Averaging for Visual SLAM [69.7375052440794]
We propose a novel optimization backbone for visual SLAM systems.
We leverage rotation averaging to improve the accuracy, efficiency, and robustness of conventional monocular SLAM systems.
Our approach runs up to 10x faster, with comparable accuracy against the state of the art on public benchmarks.
arXiv Detail & Related papers (2020-11-02T18:02:26Z)
- Reinforcement Learning for Low-Thrust Trajectory Design of Interplanetary Missions [77.34726150561087]
This paper investigates the use of reinforcement learning for the robust design of interplanetary trajectories in the presence of severe disturbances.
An open-source implementation of the state-of-the-art algorithm Proximal Policy Optimization is adopted.
The resulting Guidance and Control Network provides both a robust nominal trajectory and the associated closed-loop guidance law.
arXiv Detail & Related papers (2020-08-19T15:22:15Z)
- Learning to Optimize Non-Rigid Tracking [54.94145312763044]
We employ learnable optimizations to improve robustness and speed up solver convergence.
First, we upgrade the tracking objective by integrating an alignment data term on deep features which are learned end-to-end through a CNN.
Second, we bridge the gap between the preconditioning technique and learning method by introducing a ConditionNet which is trained to generate a preconditioner.
arXiv Detail & Related papers (2020-03-27T04:40:57Z)
- Real-Time Optimal Guidance and Control for Interplanetary Transfers Using Deep Networks [10.191757341020216]
Imitation learning of optimal examples is used as a network training paradigm.
G&CNETs are suitable for an on-board, real-time implementation of the optimal guidance and control system of the spacecraft.
arXiv Detail & Related papers (2020-02-20T23:37:43Z)
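As referenced in the truncated-trajectories entry above, here is a minimal sketch of the general idea that adaptive truncation schemes such as RIDO build on: Monte Carlo value estimates computed from trajectories cut at different horizons trade tail bias against per-trajectory cost, so a fixed interaction budget can buy either a few long rollouts or many short ones. This is generic illustrative code, not the RIDO algorithm; the function names and the choice of discount factor are assumptions.

```python
import numpy as np

def truncated_return(rewards, horizon, gamma=0.99):
    """Discounted return computed from only the first `horizon` steps of a trajectory."""
    rewards = np.asarray(rewards[:horizon], dtype=float)
    discounts = gamma ** np.arange(len(rewards))
    return float(np.dot(discounts, rewards))

def mc_value_estimate(trajectories, horizon, gamma=0.99):
    """Monte Carlo estimate of the start-state value: average truncated return.

    Shorter horizons drop tail rewards (more bias) but cost fewer environment
    steps, so more trajectories fit in the same interaction budget (less
    variance); an adaptive data-collection strategy chooses horizons to
    balance the two.
    """
    return float(np.mean([truncated_return(tr, horizon, gamma) for tr in trajectories]))
```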
This list is automatically generated from the titles and abstracts of the papers in this site.