Closed-loop deep learning: generating forward models with
back-propagation
- URL: http://arxiv.org/abs/2001.02970v2
- Date: Mon, 13 Jan 2020 11:14:24 GMT
- Title: Closed-loop deep learning: generating forward models with
back-propagation
- Authors: Sama Daryanavard, Bernd Porr
- Abstract summary: A reflex is a simple closed-loop control approach which tries to minimise an error but fails to do so because it always reacts too late.
An adaptive algorithm can use this error to learn a forward model with the help of predictive cues.
We show how this can be achieved directly by embedding deep learning into a closed-loop system and preserving its continuous processing.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A reflex is a simple closed-loop control approach which tries to
minimise an error but fails to do so because it always reacts too late. An
adaptive algorithm can use this error to learn a forward model with the help
of predictive cues. For example, a driver learns to improve their steering by
looking ahead to avoid steering at the last moment. In order to process
complex cues such as the road ahead, deep learning is a natural choice.
However, this is usually only achieved indirectly, by employing deep
reinforcement learning with a discrete state space. Here, we show how it can
be achieved directly by embedding deep learning into a closed-loop system and
preserving its continuous processing. Specifically, we show how error
back-propagation can be performed in z-space, and more generally how
gradient-based approaches can be analysed in such closed-loop scenarios. The
performance of this learning paradigm is demonstrated with a line-follower,
both in simulation and on a real robot, which shows very fast and continuous
learning.
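The core idea of the abstract, a late reflex whose error signal teaches an anticipatory pathway driven by an earlier predictive cue, can be illustrated with a minimal toy sketch. This is not the paper's deep network or its z-space derivation; it is a single-weight, ICO-style correlation rule (weight change proportional to the cue trace times the change of the reflex error) on a one-dimensional plant. All names, gains, and the plant model are illustrative assumptions.

```python
import numpy as np

def run(steps=5000, delay=5, mu=0.01, seed=0):
    """Toy closed-loop learner: a reactive reflex corrects a deviation only
    after it occurs; an adaptive pathway learns to act on a cue that fires
    `delay` steps before each disturbance (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    disturb = np.zeros(steps)
    events = rng.integers(delay, steps, size=100)
    disturb[events] = 1.0                  # "bends in the road"
    cue = np.roll(disturb, -delay)         # cue visible `delay` steps earlier
    w = 0.0                                # adaptive weight on the cue
    x = 0.0                                # deviation from the line
    trace = 0.0                            # low-pass filtered cue
    prev_err = 0.0
    for t in range(steps):
        trace = 0.8 * trace + cue[t]       # filtered cue overlaps the late error
        err = x                            # reflex error signal
        # plant update: disturbance + reactive reflex + learned anticipation
        x += disturb[t] - 0.5 * err - w * trace
        # ICO-style rule: correlate the cue trace with the change of the error
        w += mu * trace * (err - prev_err)
        prev_err = err
    return w

w = run()
print(f"learned anticipatory weight: {w:.3f}")
```

Because the error jumps just after the cue trace rises, the correlation drives `w` upward until the anticipatory term cancels most of the disturbance before the reflex has to react; with `mu=0` the weight stays at zero and only the late reflex acts.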
Related papers
- Autonomous Vehicle Controllers From End-to-End Differentiable Simulation [60.05963742334746]
We propose a differentiable simulator and design an analytic policy gradients (APG) approach to training AV controllers.
Our proposed framework brings the differentiable simulator into an end-to-end training loop, where gradients of environment dynamics serve as a useful prior to help the agent learn a more grounded policy.
We find significant improvements in performance and robustness to noise in the dynamics, as well as overall more intuitive human-like handling.
arXiv Detail & Related papers (2024-09-12T11:50:06Z) - Refining Pre-Trained Motion Models [56.18044168821188]
We take on the challenge of improving state-of-the-art supervised models with self-supervised training.
We focus on obtaining a "clean" training signal from real-world unlabelled video.
We show that our method yields reliable gains over fully-supervised methods in real videos.
arXiv Detail & Related papers (2024-01-01T18:59:33Z) - Prime and Modulate Learning: Generation of forward models with signed
back-propagation and environmental cues [0.0]
Deep neural networks employing error back-propagation for learning can suffer from exploding and vanishing gradient problems.
In this work we follow a different approach where back-propagation makes exclusive use of the sign of the error signal to prime the learning.
We present a mathematical derivation of the learning rule in z-space and demonstrate the real-time performance with a robotic platform.
arXiv Detail & Related papers (2023-09-07T16:34:30Z) - Can Direct Latent Model Learning Solve Linear Quadratic Gaussian
Control? [75.14973944905216]
We study the task of learning state representations from potentially high-dimensional observations.
We pursue a direct latent model learning approach, where a dynamic model in some latent state space is learned by predicting quantities directly related to planning.
arXiv Detail & Related papers (2022-12-30T01:42:04Z) - Real-to-Sim: Predicting Residual Errors of Robotic Systems with Sparse
Data using a Learning-based Unscented Kalman Filter [65.93205328894608]
We learn the residual errors between a dynamic and/or simulator model and the real robot.
We show that with the learned residual errors, we can further close the reality gap between dynamic models, simulations, and actual hardware.
arXiv Detail & Related papers (2022-09-07T15:15:12Z) - Feedback Linearization of Car Dynamics for Racing via Reinforcement
Learning [0.0]
We seek to learn a linearizing controller to simplify the process of controlling a car to race autonomously.
A soft actor-critic approach is used to learn a decoupling matrix and drift vector that effectively correct for errors in a hand-designed linearizing controller.
To do so, we posit an extension to the method of learning feedback linearization; a neural network that is trained using supervised learning to convert the output of our linearizing controller to the required input for the racing environment.
arXiv Detail & Related papers (2021-10-20T09:11:18Z) - Autoencoder based Randomized Learning of Feedforward Neural Networks for
Regression [0.0]
Gradient-based learning suffers from many drawbacks, making the training process ineffective and time-consuming.
Alternative randomized learning does not use gradients but selects hidden node parameters randomly.
A recently proposed method uses autoencoders for unsupervised parameter learning.
arXiv Detail & Related papers (2021-07-04T19:07:39Z) - On the Theory of Reinforcement Learning with Once-per-Episode Feedback [120.5537226120512]
We introduce a theory of reinforcement learning in which the learner receives feedback only once at the end of an episode.
This is arguably more representative of real-world applications than the traditional requirement that the learner receive feedback at every time step.
arXiv Detail & Related papers (2021-05-29T19:48:51Z) - Deep learning: a statistical viewpoint [120.94133818355645]
Deep learning has revealed some major surprises from a theoretical perspective.
In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems.
We conjecture that specific principles underlie these phenomena.
arXiv Detail & Related papers (2021-03-16T16:26:36Z) - Episodic Self-Imitation Learning with Hindsight [7.743320290728377]
Episodic self-imitation learning is a novel self-imitation algorithm with a trajectory selection module and an adaptive loss function.
A selection module is introduced to filter uninformative samples from each episode of the update.
Episodic self-imitation learning has the potential to be applied to real-world problems that have continuous action spaces.
arXiv Detail & Related papers (2020-11-26T20:36:42Z) - Learning Navigation Costs from Demonstration with Semantic Observations [24.457042947946025]
This paper focuses on inverse reinforcement learning (IRL) for autonomous robot navigation using semantic observations.
We develop a map encoder, which infers semantic class probabilities from the observation sequence, and a cost encoder, defined as deep neural network over the semantic features.
We show that our approach learns to follow traffic rules in the autonomous driving CARLA simulator by relying on semantic observations of cars, sidewalks and road lanes.
arXiv Detail & Related papers (2020-06-09T04:35:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.