Synthesis from Satisficing and Temporal Goals
- URL: http://arxiv.org/abs/2205.10464v1
- Date: Fri, 20 May 2022 23:46:31 GMT
- Title: Synthesis from Satisficing and Temporal Goals
- Authors: Suguman Bansal, Lydia Kavraki, Moshe Y. Vardi, Andrew Wells
- Abstract summary: An existing approach combines techniques from LTL synthesis with optimization for the DS rewards but has failed to yield a sound algorithm.
An alternative approach combining LTL synthesis with satisficing DS rewards (rewards that achieve a threshold) is sound and complete for integer discount factors, but, in practice, a fractional discount factor is desired.
This work extends the existing satisficing approach, presenting the first sound algorithm for synthesis from LTL and DS rewards with fractional discount factors.
- Score: 21.14507575500482
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reactive synthesis from high-level specifications that combine hard
constraints expressed in Linear Temporal Logic (LTL) with soft constraints
expressed by discounted-sum (DS) rewards has applications in planning and
reinforcement learning. An existing approach combines techniques from LTL
synthesis with optimization for the DS rewards but has failed to yield a sound
algorithm. An alternative approach combining LTL synthesis with satisficing DS
rewards (rewards that achieve a threshold) is sound and complete for integer
discount factors, but, in practice, a fractional discount factor is desired.
This work extends the existing satisficing approach, presenting the first sound
algorithm for synthesis from LTL and DS rewards with fractional discount
factors. The utility of our algorithm is demonstrated on robotic planning
domains.
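The satisficing objective itself is simple to state. Below is a minimal Python sketch (illustrative only, not the paper's synthesis algorithm) of the discounted-sum value of a reward sequence and the threshold test it must meet; the discount factor 0.9 is an assumed example of the fractional case the paper targets.

```python
# Minimal sketch (not the paper's algorithm): the discounted-sum (DS) value
# of a reward sequence, DS(r, d) = sum_i r_i * d^i, and the satisficing test
# DS(r, d) >= threshold that the synthesis problem asks to guarantee.

def discounted_sum(rewards, discount):
    """DS value of a finite reward prefix with discount factor in (0, 1)."""
    total = 0.0
    factor = 1.0
    for r in rewards:
        total += r * factor
        factor *= discount
    return total

def satisfices(rewards, discount, threshold):
    """True if the prefix already meets the satisficing threshold."""
    return discounted_sum(rewards, discount) >= threshold

# Example: with discount 0.9 (fractional, the case this paper handles),
# the prefix [1, 0, 1] has DS value 1 + 0 + 0.81 = 1.81.
assert satisfices([1, 0, 1], 0.9, 1.8)
```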
Related papers
- Directed Exploration in Reinforcement Learning from Linear Temporal Logic [59.707408697394534]
Linear temporal logic (LTL) is a powerful language for task specification in reinforcement learning.
We show that the synthesized reward signal remains fundamentally sparse, making exploration challenging.
We show how better exploration can be achieved by further leveraging the specification and casting its corresponding Limit Deterministic Büchi Automaton (LDBA) as a Markov reward process.
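As a rough illustration of the automaton-as-reward idea (a schematic sketch with invented names, not the paper's construction), rewarding only accepting automaton transitions also shows why the raw signal is sparse:

```python
# Schematic sketch (illustrative, not the paper's code): reward an RL agent
# through the product of the environment state with an automaton tracking
# the LTL specification. Reward is granted only on entering an accepting
# state, which is what makes the raw signal sparse.

class AutomatonReward:
    def __init__(self, delta, accepting):
        self.delta = delta            # dict: (aut_state, label) -> aut_state
        self.accepting = accepting    # set of accepting automaton states

    def step(self, aut_state, label):
        """Advance the automaton on one observation label; reward 1.0
        only when an accepting state is entered."""
        nxt = self.delta[(aut_state, label)]
        reward = 1.0 if nxt in self.accepting else 0.0
        return nxt, reward

# Toy spec "eventually a" as a two-state automaton.
delta = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 1}
shaper = AutomatonReward(delta, accepting={1})
q, r = shaper.step(0, "b")   # q == 0, r == 0.0: still waiting for "a"
q, r = shaper.step(q, "a")   # q == 1, r == 1.0: spec progress rewarded
```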
arXiv Detail & Related papers (2024-08-18T14:25:44Z)
- On-the-fly Synthesis for LTL over Finite Traces: An Efficient Approach that Counts [20.14001970300658]
We present an on-the-fly framework for Linear Temporal Logic over finite traces (LTLf) based on top-down deterministic automata construction.
arXiv Detail & Related papers (2024-08-14T06:52:58Z)
- Guiding Enumerative Program Synthesis with Large Language Models [15.500250058226474]
In this paper, we evaluate the abilities of Large Language Models to solve formal synthesis benchmarks.
When one-shot synthesis fails, we propose a novel enumerative synthesis algorithm.
We find that GPT-3.5 as a stand-alone tool for formal synthesis is easily outperformed by state-of-the-art formal synthesis algorithms.
arXiv Detail & Related papers (2024-03-06T19:13:53Z)
- Federated Conditional Stochastic Optimization [110.513884892319]
Conditional stochastic optimization has found applications in a wide range of machine learning tasks, such as invariant learning, AUPRC maximization, and MAML.
This paper proposes conditional stochastic optimization algorithms for the federated learning setting.
arXiv Detail & Related papers (2023-10-04T01:47:37Z)
- Optimization-based Block Coordinate Gradient Coding for Mitigating Partial Stragglers in Distributed Learning [58.91954425047425]
This paper aims to design a new gradient coding scheme for mitigating partial stragglers in distributed learning.
We propose a gradient coordinate coding scheme with L coding parameters, representing L possibly different diversities for the L coordinates, which generalizes most existing gradient coding schemes.
arXiv Detail & Related papers (2022-06-06T09:25:40Z)
- Semantic Probabilistic Layers for Neuro-Symbolic Learning [83.25785999205932]
We design a predictive layer for structured-output prediction (SOP).
It can be plugged into any neural network, guaranteeing that its predictions are consistent with a set of predefined symbolic constraints.
Our Semantic Probabilistic Layer (SPL) can model intricate correlations and hard constraints over a structured output space.
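The general idea of constraint-consistent prediction can be illustrated by a mask-and-renormalize baseline (this is not the SPL itself, which is more expressive; names here are illustrative):

```python
import math

# Generic sketch of constraint-consistent prediction (a simple
# mask-and-renormalize baseline, not the Semantic Probabilistic Layer):
# zero out probability mass on outputs that violate the symbolic
# constraints, then renormalize over the valid outputs.

def constrained_softmax(logits, is_valid):
    """Softmax restricted to the outputs for which is_valid(i) holds."""
    valid = [i for i in range(len(logits)) if is_valid(i)]
    m = max(logits[i] for i in valid)                 # for numerical stability
    exps = {i: math.exp(logits[i] - m) for i in valid}
    z = sum(exps.values())
    return {i: exps[i] / z for i in valid}

# Example: constraint "output index must be even".
probs = constrained_softmax([2.0, 5.0, 1.0, 0.5], lambda i: i % 2 == 0)
assert abs(sum(probs.values()) - 1.0) < 1e-9 and 1 not in probs
```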
arXiv Detail & Related papers (2022-06-01T12:02:38Z)
- Distributed Control using Reinforcement Learning with Temporal-Logic-Based Reward Shaping [0.2320417845168326]
We present a framework for synthesis of distributed control strategies for a heterogeneous team of robots in a partially observable environment.
Our approach formulates the synthesis problem as a game and employs a policy graph method to find a control strategy with memory for each agent.
We use the quantitative semantics of Truncated Linear Temporal Logic (TLTL) as the reward of the game, and further reshape it using the finite-state automaton to guide and accelerate the learning process.
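For intuition, quantitative semantics assigns each trajectory a real-valued robustness that can serve as a reward. Below is a minimal sketch for a toy "eventually reach the goal" formula (illustrative only, not the paper's TLTL machinery):

```python
# Minimal sketch of quantitative temporal-logic semantics as a reward
# signal (illustrative, not the paper's TLTL implementation). For
# "eventually phi", robustness over a trajectory is the max over time of
# phi's robustness; positive values mean the formula is satisfied.

def robustness_eventually(trajectory, goal, tol):
    """rho(F 'near goal') = max_t (tol - dist(x_t, goal))."""
    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5
    return max(tol - dist(x, goal) for x in trajectory)

traj = [(0.0, 0.0), (0.5, 0.5), (0.9, 1.0)]
rho = robustness_eventually(traj, goal=(1.0, 1.0), tol=0.2)
# rho > 0 iff some state comes within tol of the goal; usable as a reward.
```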
arXiv Detail & Related papers (2022-03-08T16:03:35Z)
- Faster One-Sample Stochastic Conditional Gradient Method for Composite Convex Minimization [61.26619639722804]
We propose a conditional gradient method (CGM) for minimizing convex finite-sum objectives formed as a sum of smooth and non-smooth terms.
The proposed method, equipped with a stochastic average gradient (SAG) estimator, requires only one sample per iteration. Nevertheless, it guarantees fast convergence rates on par with more sophisticated variance reduction techniques.
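A rough sketch of the one-sample idea (a simplified stand-in using a running-average gradient estimator in the spirit of SAG, not the paper's exact method), applied to a smooth objective over the probability simplex:

```python
import random

# Illustrative one-sample conditional gradient (Frank-Wolfe) sketch over
# the probability simplex (a simplified stand-in, not the paper's method).
# Objective: f(x) = E_i ||x - a_i||^2 for sample points a_i.

def one_sample_cgm(samples, dim, iters=500):
    x = [1.0 / dim] * dim          # start at the simplex centre
    g = [0.0] * dim                # running gradient average
    for t in range(1, iters + 1):
        a = random.choice(samples)                    # one sample per iteration
        grad = [2.0 * (x[j] - a[j]) for j in range(dim)]
        rho = 1.0 / t                                 # averaging weight
        g = [(1 - rho) * g[j] + rho * grad[j] for j in range(dim)]
        s = min(range(dim), key=lambda j: g[j])       # LMO: best simplex vertex
        gamma = 2.0 / (t + 2)                         # classic FW step size
        x = [(1 - gamma) * x[j] for j in range(dim)]
        x[s] += gamma
    return x

random.seed(0)
pts = [(0.7, 0.2, 0.1), (0.5, 0.4, 0.1)]
x_hat = one_sample_cgm(pts, dim=3)   # approaches the sample mean, which lies on the simplex
```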
arXiv Detail & Related papers (2022-02-26T19:10:48Z)
- Towards Neural-Guided Program Synthesis for Linear Temporal Logic Specifications [26.547133495699093]
We use a neural network to learn a Q-function that is then used to guide search, and to construct programs that are subsequently verified for correctness.
Our method is unique in combining search with deep learning to realize synthesis.
arXiv Detail & Related papers (2019-12-31T17:09:49Z)
- Certified Reinforcement Learning with Logic Guidance [78.2286146954051]
We propose a model-free RL algorithm that enables the use of Linear Temporal Logic (LTL) to formulate a goal for unknown continuous-state/action Markov Decision Processes (MDPs).
The algorithm is guaranteed to synthesise a control policy whose traces satisfy the specification with maximal probability.
arXiv Detail & Related papers (2019-02-02T20:09:32Z)