Hybrid of representation learning and reinforcement learning for dynamic
and complex robotic motion planning
- URL: http://arxiv.org/abs/2309.03758v1
- Date: Thu, 7 Sep 2023 15:00:49 GMT
- Title: Hybrid of representation learning and reinforcement learning for dynamic
and complex robotic motion planning
- Authors: Chengmin Zhou, Xin Lu, Jiapeng Dai, Bingding Huang, Xiaoxu Liu, and
Pasi Fränti
- Abstract summary: This paper introduces a hybrid algorithm for robotic motion planning: long short-term memory (LSTM) pooling and skip connection for attention-based discrete soft actor critic (LSA-DSAC).
Experiments show that LSA-DSAC outperforms the state-of-the-art in training and most evaluations.
- Score: 3.794762046318001
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Motion planning is the soul of robot decision making. Classical planning
algorithms like graph search and reaction-based algorithms face challenges in
cases of dense and dynamic obstacles. Deep learning algorithms generate
suboptimal one-step predictions that cause many collisions. Reinforcement
learning algorithms generate optimal or near-optimal time-sequential
predictions. However, they suffer from slow convergence, suboptimal converged
results, and overfitting. This paper introduces a hybrid algorithm for robotic
motion planning: long short-term memory (LSTM) pooling and skip connection for
attention-based discrete soft actor critic (LSA-DSAC). First, graph network
(relational graph) and attention network (attention weight) interpret the
environmental state for the learning of the discrete soft actor critic
algorithm. The expressive power of attention network outperforms that of graph
in our task by difference analysis of these two representation methods.
However, attention based DSAC faces the overfitting problem in training.
Second, the skip connection method is integrated into attention-based DSAC to
mitigate overfitting and improve convergence speed. Third, LSTM pooling is
taken to replace the sum operator of the attention weights and eliminate overfitting
by slightly sacrificing convergence speed at early-stage training. Experiments
show that LSA-DSAC outperforms the state-of-the-art in training and most
evaluations. The physical robot is also implemented and tested in the real
world.
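The three ingredients named in the abstract (attention weights, LSTM pooling in place of the sum operator, and a skip connection) can be illustrated with a toy sketch. This is not the authors' implementation: the names `TinyLSTM`, `attention_sum_pool`, and `encode_with_skip` are hypothetical, the LSTM weights are random rather than trained, and the skip connection is modeled here as a simple concatenation of the raw robot state with the pooled feature, which is an assumption about the wiring.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of raw attention scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_sum_pool(features, scores):
    """Baseline pooling: weight each obstacle feature by attention, then sum."""
    weights = softmax(scores)
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTM:
    """Hypothetical single-layer LSTM with random (untrained) weights."""

    def __init__(self, in_dim, hid_dim, seed=0):
        rng = random.Random(seed)
        self.hid_dim = hid_dim
        # One weight matrix per gate; the last column acts as the bias.
        self.w = {
            gate: [[rng.uniform(-0.5, 0.5) for _ in range(in_dim + hid_dim + 1)]
                   for _ in range(hid_dim)]
            for gate in "ifog"
        }

    def _linear(self, gate, z):
        return [sum(wi * zi for wi, zi in zip(row, z)) for row in self.w[gate]]

    def pool(self, sequence):
        """Run the sequence through the LSTM and return the final hidden state."""
        h = [0.0] * self.hid_dim
        c = [0.0] * self.hid_dim
        for x in sequence:
            z = list(x) + h + [1.0]  # input, previous hidden state, bias term
            i = [sigmoid(v) for v in self._linear("i", z)]    # input gate
            f = [sigmoid(v) for v in self._linear("f", z)]    # forget gate
            o = [sigmoid(v) for v in self._linear("o", z)]    # output gate
            g = [math.tanh(v) for v in self._linear("g", z)]  # candidate cell
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h


def encode_with_skip(robot_state, features, scores, lstm):
    """Attention-weight the features, pool them with the LSTM instead of a
    sum, then concatenate the raw robot state (assumed skip connection)."""
    weights = softmax(scores)
    weighted = [[w * v for v in f] for w, f in zip(weights, features)]
    pooled = lstm.pool(weighted)
    return list(robot_state) + pooled
```

For example, `encode_with_skip([0.5, -0.5], [[1.0, 2.0], [3.0, 4.0]], [0.1, 0.9], TinyLSTM(2, 3))` returns a five-element vector: the two robot-state values followed by a three-dimensional pooled feature. Unlike the sum in `attention_sum_pool`, the LSTM output depends on the order in which the obstacle features are fed in.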
Related papers
- Latency-aware Unified Dynamic Networks for Efficient Image Recognition [72.8951331472913]
LAUDNet is a framework to bridge the theoretical and practical efficiency gap in dynamic networks.
It integrates three primary dynamic paradigms: spatially adaptive computation, dynamic layer skipping, and dynamic channel skipping.
It can notably reduce the latency of models like ResNet by over 50% on platforms such as V100, 3090, and TX2 GPUs.
arXiv Detail & Related papers (2023-08-30T10:57:41Z)
- Representation Learning with Multi-Step Inverse Kinematics: An Efficient
and Optimal Approach to Rich-Observation RL [106.82295532402335]
Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics.
arXiv Detail & Related papers (2023-04-12T14:51:47Z)
- MASS: Mobility-Aware Sensor Scheduling of Cooperative Perception for
Connected Automated Driving [19.66714697653504]
A new paradigm, Cooperative Perception (CP), comes to the rescue by sharing sensor data from a cooperative vehicle (CoV).
Existing methods rely on the exchange of meta-information, such as visibility maps, to predict the perception gains from nearby vehicles.
We propose a new approach, learning while scheduling, for distributed scheduling of CP.
The proposed MASS algorithm achieves the best average perception gain and improves recall by up to 4.2 percentage points compared to other learning-based algorithms.
arXiv Detail & Related papers (2023-02-25T09:03:05Z)
- Joint inference and input optimization in equilibrium networks [68.63726855991052]
The deep equilibrium model is a class of models that forgoes traditional network depth and instead computes the output of a network by finding the fixed point of a single nonlinear layer.
We show that there is a natural synergy between these two settings.
We demonstrate this strategy on various tasks such as training generative models while optimizing over latent codes, training models for inverse problems like denoising and inpainting, adversarial training and gradient based meta-learning.
arXiv Detail & Related papers (2021-11-25T19:59:33Z)
- An advantage actor-critic algorithm for robotic motion planning in dense
and dynamic scenarios [0.8594140167290099]
In this paper, we modify the existing advantage actor-critic algorithm and adapt it to complex motion planning.
It achieves a higher success rate in motion planning, with less processing time for the robot to reach its goal.
arXiv Detail & Related papers (2021-02-05T12:30:23Z)
- A review of motion planning algorithms for intelligent robotics [0.8594140167290099]
We investigate and analyze principles of typical motion planning algorithms.
Traditional planning algorithms include graph search algorithms, sampling-based algorithms, and interpolating curve algorithms.
Supervised learning algorithms include MSVM, LSTM, MCTS and CNN.
Policy gradient algorithms include policy gradient method, actor-critic algorithm, A3C, A2C, DPG, DDPG, TRPO and PPO.
arXiv Detail & Related papers (2021-02-04T02:24:04Z)
- Phase Retrieval using Expectation Consistent Signal Recovery Algorithm
based on Hypernetwork [73.94896986868146]
Phase retrieval is an important component in modern computational imaging systems.
Recent advances in deep learning have opened up a new possibility for robust and fast PR.
We develop a novel framework for deep unfolding to overcome the existing limitations.
arXiv Detail & Related papers (2021-01-12T08:36:23Z)
- Geometric Deep Reinforcement Learning for Dynamic DAG Scheduling [8.14784681248878]
In this paper, we propose a reinforcement learning approach to solve a realistic scheduling problem.
We apply it to an algorithm commonly executed in the high performance computing community, the Cholesky factorization.
Our algorithm uses graph neural networks in combination with an actor-critic algorithm (A2C) to build an adaptive representation of the problem on the fly.
arXiv Detail & Related papers (2020-11-09T10:57:21Z)
- Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study distributed large-scale AUC maximization with a deep neural network.
Our method requires far fewer communication rounds in theory.
Our experiments on several datasets show the effectiveness of our method and also confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
- Robust Deep Learning as Optimal Control: Insights and Convergence
Guarantees [19.28405674700399]
Including adversarial examples during training is a popular defense mechanism against adversarial attacks.
By interpreting the min-max problem as an optimal control problem, it has been shown that one can exploit the compositional structure of neural networks.
We provide the first convergence analysis of this adversarial training algorithm by combining techniques from robust optimal control and inexact methods in optimization.
arXiv Detail & Related papers (2020-05-01T21:26:38Z)
- Joint Parameter-and-Bandwidth Allocation for Improving the Efficiency of
Partitioned Edge Learning [73.82875010696849]
Machine learning algorithms are deployed at the network edge for training artificial intelligence (AI) models.
This paper focuses on the novel joint design of parameter (computation load) allocation and bandwidth allocation.
arXiv Detail & Related papers (2020-03-10T05:52:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.