Guiding LLM Temporal Logic Generation with Explicit Separation of Data and Control
- URL: http://arxiv.org/abs/2406.07400v1
- Date: Tue, 11 Jun 2024 16:07:24 GMT
- Title: Guiding LLM Temporal Logic Generation with Explicit Separation of Data and Control
- Authors: William Murphy, Nikolaus Holzer, Nathan Koenig, Leyi Cui, Raven Rothkopf, Feitong Qiao, Mark Santolucito,
- Abstract summary: Temporal logics are powerful tools that are widely used for the synthesis and verification of reactive systems.
Recent progress on Large Language Models has the potential to make the process of writing such specifications more accessible.
- Score: 0.7580487359358722
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Temporal logics are powerful tools that are widely used for the synthesis and verification of reactive systems. The recent progress on Large Language Models (LLMs) has the potential to make the process of writing such specifications more accessible. However, writing specifications in temporal logics remains challenging for all but the most expert users. A key question in using LLMs for temporal logic specification engineering is to understand what kind of guidance is most helpful to the LLM and the users to easily produce specifications. Looking specifically at the problem of reactive program synthesis, we explore the impact of providing an LLM with guidance on the separation of control and data--making explicit for the LLM what functionality is relevant for the specification, and treating the remaining functionality as an implementation detail for a series of pre-defined functions and predicates. We present a benchmark set and find that this separation of concerns improves specification generation. Our benchmark provides a test set against which to verify future work in LLM generation of temporal logic specifications.
Related papers
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts.
We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM.
We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z) - Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning [53.6472920229013]
Large Language Models (LLMs) have demonstrated impressive capability in many natural language tasks.
LLMs are prone to produce errors, hallucinations and inconsistent statements when performing multi-step reasoning.
We introduce Q*, a framework for guiding LLMs decoding process with deliberative planning.
arXiv Detail & Related papers (2024-06-20T13:08:09Z) - Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More? [54.667202878390526]
Long-context language models (LCLMs) have the potential to revolutionize our approach to tasks traditionally reliant on external tools like retrieval systems or databases.
We introduce LOFT, a benchmark of real-world tasks requiring context up to millions of tokens designed to evaluate LCLMs' performance on in-context retrieval and reasoning.
Our findings reveal LCLMs' surprising ability to rival state-of-the-art retrieval and RAG systems, despite never having been explicitly trained for these tasks.
arXiv Detail & Related papers (2024-06-19T00:28:58Z) - Test of Time: A Benchmark for Evaluating LLMs on Temporal Reasoning [20.066249913943405]
Large language models (LLMs) have showcased remarkable reasoning capabilities, yet they remain susceptible to errors.
We introduce novel synthetic datasets specifically designed to assess LLM temporal reasoning abilities in various scenarios.
Our findings provide valuable insights into the strengths and weaknesses of current LLMs in temporal reasoning tasks.
arXiv Detail & Related papers (2024-06-13T14:31:19Z) - CodecLM: Aligning Language Models with Tailored Synthetic Data [51.59223474427153]
We introduce CodecLM, a framework for adaptively generating high-quality synthetic data for instruction-following abilities.
We first encode seed instructions into metadata, which are concise keywords generated on-the-fly to capture the target instruction distribution.
We also introduce Self-Rubrics and Contrastive Filtering during decoding to tailor data-efficient samples.
arXiv Detail & Related papers (2024-04-08T21:15:36Z) - Temporal Blind Spots in Large Language Models [20.631107338678234]
Large language models (LLMs) have recently gained significant attention due to their unparalleled ability to perform various natural language processing tasks.
This study investigates the underlying limitations of general-purpose LLMs when deployed for tasks that require a temporal understanding.
arXiv Detail & Related papers (2024-01-22T16:20:14Z) - If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code
Empowers Large Language Models to Serve as Intelligent Agents [81.60906807941188]
Large language models (LLMs) are trained on a combination of natural language and formal language (code)
Code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity.
arXiv Detail & Related papers (2024-01-01T16:51:20Z) - When does In-context Learning Fall Short and Why? A Study on
Specification-Heavy Tasks [54.71034943526973]
In-context learning (ICL) has become the default method for using large language models (LLMs)
We find that ICL falls short of handling specification-heavy tasks, which are tasks with complicated and extensive task specifications.
We identify three primary reasons: inability to specifically understand context, misalignment in task schema comprehension with humans, and inadequate long-text understanding ability.
arXiv Detail & Related papers (2023-11-15T14:26:30Z) - How You Prompt Matters! Even Task-Oriented Constraints in Instructions Affect LLM-Generated Text Detection [39.254432080406346]
Even task-oriented constraints -- constraints that would naturally be included in an instruction and are not related to detection-evasion -- cause existing powerful detectors to have a large variance in detection performance.
Our experiments show that the standard deviation (SD) of current detector performance on texts generated by an instruction with such a constraint is significantly larger (up to an SD of 14.4 F1-score) than that by generating texts multiple times or paraphrasing the instruction.
arXiv Detail & Related papers (2023-11-14T18:32:52Z) - Exploring Parameter-Efficient Fine-Tuning Techniques for Code Generation
with Large Language Models [12.708117108874083]
Large Language Models (LLMs) generate code snippets given natural language intents in zero-shot, i.e., without the need for specific fine-tuning.
Previous research explored In-Context Learning (ICL) as a strategy to guide the LLM generative process with task-specific prompt examples.
In this paper, we deliver a comprehensive study of.
PEFT techniques for LLMs under the automated code generation scenario.
arXiv Detail & Related papers (2023-08-21T04:31:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.