Tractable Reinforcement Learning of Signal Temporal Logic Objectives
- URL: http://arxiv.org/abs/2001.09467v2
- Date: Mon, 17 Feb 2020 15:17:50 GMT
- Title: Tractable Reinforcement Learning of Signal Temporal Logic Objectives
- Authors: Harish Venkataraman, Derya Aksaray, Peter Seiler
- Abstract summary: Signal temporal logic (STL) is an expressive language to specify time-bound real-world robotic tasks and safety specifications.
Learning to satisfy STL specifications often requires a sufficient history of states to compute the reward and the next action.
We propose a compact means to capture state history in a new augmented state-space representation.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Signal temporal logic (STL) is an expressive language to specify time-bound
real-world robotic tasks and safety specifications. Recently, there has been an
interest in learning optimal policies to satisfy STL specifications via
reinforcement learning (RL). Learning to satisfy STL specifications often
requires a sufficient history of states to compute the reward and the next
action. This need for history causes exponential growth of the state space,
making the learning problem computationally intractable for most real-world
applications. In this paper, we propose a compact means of capturing state
history in a new augmented state-space representation. We propose an
approximation to the objective (maximizing the probability of satisfaction)
and solve it in the new augmented state space. We derive a performance bound
for the approximate solution and compare it with the solution of an existing
technique via simulations.
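As a rough illustration of the augmented-state idea (not the paper's exact construction): for a specification of the form G F[0,d] psi, a counter c recording the steps since psi last held is a compact sufficient statistic of the d-step history, so plain Q-learning can run on the pair (s, c) instead of on full state histories. All names, dynamics, and constants in the sketch below are illustrative.

```python
"""Sketch: Q-learning on an augmented state (s, c), where c counts steps
since the predicate psi last held. For a task like G F[0,d] psi, c is a
compact stand-in for the d-step history (toy construction, not the paper's)."""
import random
from collections import defaultdict

D = 4                               # window length d in F[0,d]
GOAL = {0}                          # states where psi holds (assumed)
N_STATES, ACTIONS = 8, (-1, +1)
GAMMA, ALPHA, EPS = 0.95, 0.1, 0.2

def step(s, c, a):
    s2 = max(0, min(N_STATES - 1, s + a))
    c2 = 0 if s2 in GOAL else min(c + 1, D + 1)         # reset when psi holds
    r = 1.0 if c2 == 0 else (-10.0 if c2 > D else 0.0)  # penalize violation
    return s2, c2, r

Q = defaultdict(float)
for _ in range(5000):                                   # tabular Q-learning
    s, c = random.randrange(N_STATES), 0
    for _ in range(50):
        a = random.choice(ACTIONS) if random.random() < EPS \
            else max(ACTIONS, key=lambda x: Q[(s, c, x)])
        s2, c2, r = step(s, c, a)
        target = r + GAMMA * max(Q[(s2, c2, x)] for x in ACTIONS)
        Q[(s, c, a)] += ALPHA * (target - Q[(s, c, a)])
        s, c = s2, c2
```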
Related papers
- State Chrono Representation for Enhancing Generalization in Reinforcement Learning [36.12688166503104]
In reinforcement learning with image-based inputs, it is crucial to establish a robust and generalizable state representation.
We propose a novel State Chrono Representation (SCR) approach to address these challenges.
SCR augments state metric-based representations by incorporating extensive temporal information into the update step of bisimulation metric learning.
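For context, the bisimulation-metric update that SCR builds on can be sketched in its plain (DBC-style) form; SCR's additional temporal terms are not reproduced here, and the linear encoder and data are toy stand-ins.

```python
"""Plain bisimulation-style metric loss (DBC-like), the base that SCR
augments; encoder and inputs are toy stand-ins, not SCR's architecture."""
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(6, 3))            # toy linear encoder
encode = lambda obs: obs @ W

def bisim_loss(obs_i, obs_j, r_i, r_j, next_zi, next_zj, gamma=0.99):
    z_i, z_j = encode(obs_i), encode(obs_j)
    d_pred = np.linalg.norm(z_i - z_j, axis=-1)
    # Target: reward difference plus discounted next-embedding distance.
    d_tgt = np.abs(r_i - r_j) + gamma * np.linalg.norm(next_zi - next_zj, axis=-1)
    return np.mean((d_pred - d_tgt) ** 2)
```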
arXiv Detail & Related papers (2024-11-09T13:12:34Z)
- Directed Exploration in Reinforcement Learning from Linear Temporal Logic [59.707408697394534]
Linear temporal logic (LTL) is a powerful language for task specification in reinforcement learning.
We show that the synthesized reward signal remains fundamentally sparse, making exploration challenging.
We show how better exploration can be achieved by further leveraging the specification and casting its corresponding Limit-Deterministic Büchi Automaton (LDBA) as a Markov reward process.
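Concretely, this view starts from the usual product of the MDP with the automaton, with reward issued on accepting transitions. A minimal hand-coded sketch for the toy formula "F goal" (a real LDBA for richer formulas would come from a translation tool such as Owl or Rabinizer; all names here are illustrative):

```python
# Toy two-state automaton for "F goal": q0 -> q_acc once the goal label is seen.
AUT = {('q0', True): 'q_acc', ('q0', False): 'q0',
       ('q_acc', True): 'q_acc', ('q_acc', False): 'q_acc'}

def product_step(T, label, s, q, a):
    """One step of the product MDP; T and label are assumed callables."""
    s2 = T(s, a)                                    # environment transition
    q2 = AUT[(q, label(s2))]                        # automaton reads the label
    r = 1.0 if (q, q2) == ('q0', 'q_acc') else 0.0  # reward accepting edges
    return (s2, q2), r

# Example: 1-D chain where state 4 is labelled "goal".
T = lambda s, a: max(0, min(4, s + a))
label = lambda s: s == 4
print(product_step(T, label, 3, 'q0', +1))          # -> ((4, 'q_acc'), 1.0)
```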
arXiv Detail & Related papers (2024-08-18T14:25:44Z)
- The Power of Resets in Online Reinforcement Learning [73.64852266145387]
We explore the power of simulators through online reinforcement learning with local simulator access (or local planning).
We show that MDPs with low coverability can be learned in a sample-efficient fashion with only $Q^\star$-realizability.
We show that the notorious Exogenous Block MDP problem is tractable under local simulator access.
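Here "local simulator access" means the learner may reset the simulator to any previously visited state rather than only to the initial distribution. A minimal wrapper sketch under assumed interfaces (env.state, a gym-like env.step, and hashable observations are assumptions, not the paper's API):

```python
"""Sketch of a local-simulator wrapper: remembers snapshots of visited
states and can jump back to them (interfaces are assumed, not the paper's)."""
import copy

class LocalSimulator:
    def __init__(self, env):
        self.env = env
        self._snapshots = {}                    # obs -> saved internal state

    def step(self, action):
        obs, reward, done = self.env.step(action)   # assumed gym-like API
        self._snapshots[obs] = copy.deepcopy(self.env.state)
        return obs, reward, done

    def reset_to(self, obs):
        """Local planning primitive: restore a previously visited state."""
        self.env.state = copy.deepcopy(self._snapshots[obs])
        return obs
```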
arXiv Detail & Related papers (2024-04-23T18:09:53Z)
- State Sequences Prediction via Fourier Transform for Representation Learning [111.82376793413746]
We propose State Sequences Prediction via Fourier Transform (SPF), a novel method for learning expressive representations efficiently.
We theoretically analyze the existence of structural information in state sequences, which is closely related to policy performance and signal regularity.
Experiments demonstrate that the proposed method outperforms several state-of-the-art algorithms in terms of both sample efficiency and performance.
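The core idea can be sketched as follows: summarize a rollout of future states by its low-frequency Fourier coefficients and regress a head of the representation onto them (the paper's actual architecture and losses differ; fourier_target and predictor are illustrative names):

```python
"""Toy sketch of SPF's target: low-frequency FFT coefficients of a future
state sequence serve as a regression target for the representation."""
import numpy as np

def fourier_target(state_seq, k=8):
    """First k rFFT coefficients of a (T, dim) sequence (needs T//2+1 >= k)."""
    coeffs = np.fft.rfft(state_seq, axis=0)[:k]
    return np.concatenate([coeffs.real, coeffs.imag]).ravel()

def spf_loss(predictor, z, state_seq):
    """Mean-squared error between a predicted and the true Fourier summary."""
    return float(np.mean((predictor(z) - fourier_target(state_seq)) ** 2))
```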
arXiv Detail & Related papers (2023-10-24T14:47:02Z)
- Near-optimal Policy Identification in Active Reinforcement Learning [84.27592560211909]
AE-LSVI is a novel variant of the kernelized least-squares value iteration (LSVI) algorithm that combines optimism with pessimism for active exploration.
We show that AE-LSVI outperforms other algorithms in a variety of environments when robustness to the initial state is required.
arXiv Detail & Related papers (2022-12-19T14:46:57Z)
- Funnel-based Reward Shaping for Signal Temporal Logic Tasks in Reinforcement Learning [0.0]
We propose a tractable reinforcement learning algorithm to learn a controller that enforces Signal Temporal Logic (STL) specifications.
We demonstrate the utility of our approach on several STL tasks using different environments.
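The gist of funnel-based shaping, under simple illustrative assumptions: a prescribed envelope gamma(t) shrinks over time, and the shaped reward is the margin by which a robustness-tracking error stays inside it (the exponential funnel and the exact margin below are one simple choice, not the paper's):

```python
"""Hedged sketch of funnel-based shaping: reward is the margin between a
shrinking envelope gamma(t) and the robustness-tracking error."""
import math

def funnel(t, g0=5.0, g_inf=0.2, decay=0.5):
    """Exponentially shrinking envelope gamma(t) (illustrative constants)."""
    return (g0 - g_inf) * math.exp(-decay * t) + g_inf

def shaped_reward(rho_error, t):
    """Positive while the robustness error stays inside the funnel."""
    return funnel(t) - abs(rho_error)   # one simple choice of margin
```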
arXiv Detail & Related papers (2022-11-30T19:38:21Z)
- Temporal Feature Alignment in Contrastive Self-Supervised Learning for Human Activity Recognition [2.2082422928825136]
Self-supervised learning is typically used to learn deep feature representations from unlabeled data.
We propose integrating a dynamic time warping (DTW) algorithm in the latent space to force features to be aligned along the temporal dimension.
The proposed approach has great potential for learning robust feature representations compared to recent SSL baselines.
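For reference, the alignment cost in question is classical dynamic time warping; a differentiable relaxation such as soft-DTW would be used inside a training loss, but plain dynamic programming already shows the idea:

```python
"""Plain DP dynamic time warping between two latent sequences (illustration
only; a differentiable relaxation such as soft-DTW is needed for training)."""
import numpy as np

def dtw(z1, z2):
    n, m = len(z1), len(z2)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(z1[i - 1] - z2[j - 1])   # local distance
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```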
arXiv Detail & Related papers (2022-10-07T07:51:01Z)
- Learning Signal Temporal Logic through Neural Network for Interpretable Classification [13.829082181692872]
We propose an explainable neural-symbolic framework for the classification of time-series behaviors.
We demonstrate the computational efficiency, compactness, and interpretability of the proposed method through driving scenarios and naval surveillance case studies.
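Frameworks of this kind differentiate through STL's quantitative (robustness) semantics. A minimal sketch for one operator: the robustness of "eventually (x > c)" is a max of per-time margins, smoothed here with logsumexp so gradients can flow (learning the formula structure itself is not shown):

```python
"""Smoothed robustness of F (x > c) over a finite signal; logsumexp
replaces the hard max so the value is differentiable in the signal."""
import numpy as np

def soft_max(v, beta=10.0):
    """Differentiable approximation of max(v); exact as beta -> infinity."""
    m = np.max(v)
    return m + np.log(np.sum(np.exp(beta * (v - m)))) / beta

def robustness_eventually(signal, c):
    """rho(F (x > c)) ~ soft max over the per-time margins x_t - c."""
    return soft_max(np.asarray(signal) - c)

print(robustness_eventually([0.1, 0.4, 0.9], c=0.5))   # ~0.4 (satisfied)
```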
arXiv Detail & Related papers (2022-10-04T21:11:54Z)
- AdaS: Adaptive Scheduling of Stochastic Gradients [50.80697760166045]
We introduce the notions of "knowledge gain" and "mapping condition" and propose a new algorithm called Adaptive Scheduling (AdaS).
Experimentation reveals that, using the derived metrics, AdaS exhibits: (a) faster convergence and superior generalization over existing adaptive learning methods; and (b) lack of dependence on a validation set to determine when to stop training.
arXiv Detail & Related papers (2020-06-11T16:36:31Z)
- Continuous Motion Planning with Temporal Logic Specifications using Deep Neural Networks [16.296473750342464]
We propose a model-free reinforcement learning method to synthesize control policies for motion planning problems.
The robot is modelled as a discrete-time Markov decision process (MDP) with continuous state and action spaces.
We train deep neural networks to approximate the value function and policy using an actor-critic reinforcement learning method.
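As a stand-in for the deep actor and critic, a compact one-step actor-critic with a Gaussian policy on a toy 1-D set-point task (linear features and all constants are illustrative):

```python
"""Compact one-step actor-critic with a Gaussian policy; linear features
stand in for the paper's deep networks (toy task, illustrative constants)."""
import numpy as np

rng = np.random.default_rng(0)
phi = lambda s: np.array([1.0, s, s * s])   # simple state features
w = np.zeros(3)                             # critic weights: V(s) = w . phi
theta = np.zeros(3)                         # actor weights: mu(s) = theta . phi
SIGMA, GAMMA, A_LR, C_LR = 0.5, 0.95, 0.01, 0.05

for episode in range(2000):
    s = rng.uniform(-1, 1)
    for t in range(20):
        mu = theta @ phi(s)
        a = rng.normal(mu, SIGMA)           # sample continuous action
        s2 = np.clip(s + 0.1 * a, -2, 2)    # toy dynamics
        r = -s2 * s2                        # drive the state toward 0
        delta = r + GAMMA * (w @ phi(s2)) - (w @ phi(s))          # TD error
        w += C_LR * delta * phi(s)                                # critic
        theta += A_LR * delta * (a - mu) / SIGMA**2 * phi(s)      # actor
        s = s2
```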
arXiv Detail & Related papers (2020-04-02T17:58:03Z)
- Certified Reinforcement Learning with Logic Guidance [78.2286146954051]
We propose a model-free RL algorithm that enables the use of Linear Temporal Logic (LTL) to formulate a goal for unknown continuous-state/action Markov Decision Processes (MDPs).
The algorithm is guaranteed to synthesise a control policy whose traces satisfy the specification with maximal probability.
arXiv Detail & Related papers (2019-02-02T20:09:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.