Continuous Motion Planning with Temporal Logic Specifications using Deep
Neural Networks
- URL: http://arxiv.org/abs/2004.02610v2
- Date: Tue, 29 Sep 2020 19:18:54 GMT
- Title: Continuous Motion Planning with Temporal Logic Specifications using Deep
Neural Networks
- Authors: Chuanzheng Wang, Yinan Li, Stephen L. Smith, Jun Liu
- Abstract summary: We propose a model-free reinforcement learning method to synthesize control policies for motion planning problems.
The robot is modelled as a labeled discrete-time Markov decision process (MDP) with continuous state and action spaces.
We train deep neural networks to approximate the value function and policy using an actor-critic reinforcement learning method.
- Score: 16.296473750342464
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a model-free reinforcement learning method to
synthesize control policies for motion planning problems with continuous states
and actions. The robot is modelled as a labeled discrete-time Markov decision
process (MDP) with continuous state and action spaces. Linear temporal logics
(LTL) are used to specify high-level tasks. We then train deep neural networks
to approximate the value function and policy using an actor-critic
reinforcement learning method. The LTL specification is converted into an
annotated limit-deterministic Büchi automaton (LDBA) for continuously shaping
the reward so that dense rewards are available during training. A naïve way
of solving a motion planning problem with LTL specifications using
reinforcement learning is to sample a trajectory and then assign a high reward
for training if the trajectory satisfies the entire LTL formula. However, the
sampling complexity needed to find such a trajectory is too high when we have a
complex LTL formula for continuous state and action spaces. As a result, it is
very unlikely that we get enough reward for training if all sample trajectories
start from the initial state of the automaton. In this paper, we propose a
method that samples not only an initial state from the state space, but also an
arbitrary state of the automaton at the beginning of each training episode. We
test our algorithm in simulation using a car-like robot and find that our
method can learn policies for different working configurations and LTL
specifications successfully.
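The episode-initialization idea described above can be sketched as follows. This is a minimal illustration, not the paper's actual construction: the automaton states, transition labels, and reward values are hypothetical placeholders standing in for an annotated LDBA and its shaped reward.

```python
import random

# Hypothetical annotated LDBA: states, accepting set, and a transition
# relation over atomic-proposition labels (illustrative values only).
LDBA_STATES = [0, 1, 2, 3]
ACCEPTING = {3}
TRANSITIONS = {  # (automaton state, observed label) -> next automaton state
    (0, "a"): 1, (1, "b"): 2, (2, "c"): 3,
}

def sample_episode_start(state_low, state_high):
    """Sample a continuous robot state AND an arbitrary automaton state,
    so that later stages of the task still receive training signal even
    when full-formula-satisfying trajectories are rare."""
    robot_state = [random.uniform(lo, hi) for lo, hi in zip(state_low, state_high)]
    automaton_state = random.choice(LDBA_STATES)
    return robot_state, automaton_state

def shaped_reward(q, q_next):
    """Dense reward sketch: reward progress toward the accepting set
    instead of only rewarding satisfaction of the whole LTL formula."""
    if q_next in ACCEPTING:
        return 10.0
    return 1.0 if q_next != q else -0.01  # small step cost when no progress
```

Because every automaton state is sampled as a start state with equal probability, transitions near the accepting set are exercised from the first episodes onward, which is the source of the dense training signal.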
Related papers
- Scaling Learning based Policy Optimization for Temporal Tasks via Dropout [4.421486904657393]
We introduce a model-based approach for training feedback controllers for an autonomous agent operating in a highly nonlinear environment.
We show how this learning problem is similar to training recurrent neural networks (RNNs), where the number of recurrent units is proportional to the temporal horizon of the agent's task objectives.
We introduce a novel gradient approximation algorithm based on the idea of dropout or gradient sampling.
arXiv Detail & Related papers (2024-03-23T12:53:51Z) - Action-Quantized Offline Reinforcement Learning for Robotic Skill
Learning [68.16998247593209]
The offline reinforcement learning (RL) paradigm provides a recipe for converting static behavior datasets into policies that can perform better than the policy that collected the data.
In this paper, we propose an adaptive scheme for action quantization.
We show that several state-of-the-art offline RL methods such as IQL, CQL, and BRAC improve in performance on benchmarks when combined with our proposed discretization scheme.
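The paper learns its discretization; as a rough, hedged illustration of the general idea of fitting a quantization scheme to a dataset's actions, a simple per-dimension quantile-based quantizer (a stand-in for the paper's adaptive scheme, with names of my own choosing) might look like this:

```python
import numpy as np

def fit_quantizer(actions, n_bins=8):
    """Fit per-dimension bin edges to the empirical distribution of the
    dataset's actions, so bins are dense where the data is dense.
    (Illustrative stand-in for the paper's learned quantization.)"""
    qs = np.linspace(0.0, 1.0, n_bins + 1)[1:-1]  # interior quantiles
    return np.quantile(actions, qs, axis=0)       # shape (n_bins-1, action_dim)

def quantize(action, edges):
    """Map a continuous action to per-dimension discrete bin indices."""
    return np.array([np.searchsorted(edges[:, d], action[d])
                     for d in range(action.shape[0])])
```

A discrete policy head (as in IQL or CQL variants) can then predict bin indices instead of raw continuous actions.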
arXiv Detail & Related papers (2023-10-18T06:07:10Z) - Large Language Models as General Pattern Machines [64.75501424160748]
We show that pre-trained large language models (LLMs) are capable of autoregressively completing complex token sequences.
Surprisingly, pattern completion proficiency can be partially retained even when the sequences are expressed using tokens randomly sampled from the vocabulary.
In this work, we investigate how these zero-shot capabilities may be applied to problems in robotics.
arXiv Detail & Related papers (2023-07-10T17:32:13Z) - Formal Controller Synthesis for Markov Jump Linear Systems with
Uncertain Dynamics [64.72260320446158]
We propose a method for synthesising controllers for Markov jump linear systems.
Our method is based on a finite-state abstraction that captures both the discrete (mode-jumping) and continuous (stochastic linear) behaviour of the MJLS.
We apply our method to multiple realistic benchmark problems, in particular, a temperature control and an aerial vehicle delivery problem.
arXiv Detail & Related papers (2022-12-01T17:36:30Z) - Learning Minimally-Violating Continuous Control for Infeasible Linear
Temporal Logic Specifications [2.496282558123411]
This paper explores continuous-time control for target-driven navigation to satisfy complex high-level tasks expressed as linear temporal logic (LTL).
We propose a model-free synthesis framework using deep reinforcement learning (DRL) where the underlying dynamic system is unknown (an opaque box).
arXiv Detail & Related papers (2022-10-03T18:32:20Z) - Distributed Control using Reinforcement Learning with
Temporal-Logic-Based Reward Shaping [0.2320417845168326]
We present a framework for synthesis of distributed control strategies for a heterogeneous team of robots in a partially observable environment.
Our approach formulates the synthesis problem as a game and employs a policy graph method to find a control strategy with memory for each agent.
We use the quantitative semantics of TLTL as the reward of the game, and further reshape it using the finite state automaton to guide and accelerate the learning process.
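Quantitative (robustness) semantics of temporal logic, as used for the reward above, assign a real-valued satisfaction margin rather than a Boolean verdict. A minimal sketch for two basic temporal operators over a scalar signal (function names are mine, not from the paper):

```python
def robustness_always(signal, threshold):
    """Quantitative semantics of G(x > threshold) over a finite trace:
    the worst-case margin. Positive iff the formula holds everywhere."""
    return min(x - threshold for x in signal)

def robustness_eventually(signal, threshold):
    """Quantitative semantics of F(x > threshold) over a finite trace:
    the best-case margin. Positive iff the formula holds somewhere."""
    return max(x - threshold for x in signal)
```

Because these margins are real-valued, they give the learner a gradient-like signal toward satisfaction even on traces that violate the specification.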
arXiv Detail & Related papers (2022-03-08T16:03:35Z) - Modular Deep Reinforcement Learning for Continuous Motion Planning with
Temporal Logic [59.94347858883343]
This paper investigates the motion planning of autonomous dynamical systems modeled by Markov decision processes (MDPs).
The novelty is to design an embedded product MDP (EP-MDP) between the LDGBA and the MDP.
The proposed LDGBA-based reward shaping and discounting schemes for the model-free reinforcement learning (RL) only depend on the EP-MDP states.
arXiv Detail & Related papers (2021-02-24T01:11:25Z) - Induction and Exploitation of Subgoal Automata for Reinforcement
Learning [75.55324974788475]
We present ISA, an approach for learning and exploiting subgoals in episodic reinforcement learning (RL) tasks.
ISA interleaves reinforcement learning with the induction of a subgoal automaton, an automaton whose edges are labeled by the task's subgoals.
A subgoal automaton also consists of two special states: a state indicating the successful completion of the task, and a state indicating that the task has finished without succeeding.
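The structure described above, an automaton whose edges are labeled by subgoals and which includes dedicated success and failure states, can be sketched as a small data structure (a hedged illustration with invented names, not the ISA implementation):

```python
class SubgoalAutomaton:
    """Automaton whose edges are labeled by subgoals, with two special
    states: one for task success and one for unsuccessful termination."""

    def __init__(self, transitions, accept_state, reject_state):
        self.transitions = transitions  # (state, subgoal) -> next state
        self.accept = accept_state
        self.reject = reject_state

    def step(self, state, subgoal):
        # Stay in place if the observed subgoal labels no outgoing edge.
        return self.transitions.get((state, subgoal), state)

    def is_terminal(self, state):
        return state in (self.accept, self.reject)

# Toy task: pick up a key, then open the door; touching lava fails.
auto = SubgoalAutomaton(
    {("start", "key"): "has_key", ("has_key", "door"): "ACC",
     ("start", "lava"): "REJ", ("has_key", "lava"): "REJ"},
    accept_state="ACC", reject_state="REJ")
```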
arXiv Detail & Related papers (2020-09-08T16:42:55Z) - Tractable Reinforcement Learning of Signal Temporal Logic Objectives [0.0]
Signal temporal logic (STL) is an expressive language to specify time-bound real-world robotic tasks and safety specifications.
Learning to satisfy STL specifications often needs a sufficient length of state history to compute reward and the next action.
We propose a compact means to capture state history in a new augmented state-space representation.
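One simple way to make an STL-style objective Markovian, in the spirit of the augmented state-space idea above, is to carry a fixed-length window of recent observations alongside the current one. This is a hedged sketch under that assumption; the paper's representation is more compact, and the class name is mine:

```python
from collections import deque

class HistoryAugmentedState:
    """Augmented state: the current observation plus a bounded window of
    recent observations, long enough to evaluate a time-bounded formula."""

    def __init__(self, horizon):
        self.buffer = deque(maxlen=horizon)  # drops oldest entries itself

    def update(self, obs):
        self.buffer.append(obs)
        return tuple(self.buffer)  # the augmented state fed to the policy
```

With the window length matched to the formula's time bound, both the reward and the next action become functions of this augmented state alone.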
arXiv Detail & Related papers (2020-01-26T15:23:54Z) - Certified Reinforcement Learning with Logic Guidance [78.2286146954051]
We propose a model-free RL algorithm that enables the use of Linear Temporal Logic (LTL) to formulate a goal for unknown continuous-state/action Markov Decision Processes (MDPs).
The algorithm is guaranteed to synthesise a control policy whose traces satisfy the specification with maximal probability.
arXiv Detail & Related papers (2019-02-02T20:09:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.