Seeking Specifications: The Case for Neuro-Symbolic Specification Synthesis
- URL: http://arxiv.org/abs/2504.21061v1
- Date: Tue, 29 Apr 2025 10:02:29 GMT
- Title: Seeking Specifications: The Case for Neuro-Symbolic Specification Synthesis
- Authors: George Granberry, Wolfgang Ahrendt, Moa Johansson
- Abstract summary: This work is concerned with the generation of formal specifications from code, using Large Language Models (LLMs) in combination with symbolic methods. It addresses two research directions: the specification of intent vs. implementation, and the combination of symbolic analyses with LLMs. For the first, we investigate how the absence or presence of bugs in the code impacts the generated specifications. For the second, we investigate the impact of results from symbolic analyses on the specifications generated by the LLM.
- Score: 1.2152813244704233
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work is concerned with the generation of formal specifications from code, using Large Language Models (LLMs) in combination with symbolic methods. Concretely, in our study, the programming language is C, the specification language is ACSL, and the LLM is Deepseek-R1. In this context, we address two research directions, namely the specification of intent vs. implementation on the one hand, and the combination of symbolic analyses with LLMs on the other hand. For the first, we investigate how the absence or presence of bugs in the code impacts the generated specifications, as well as whether and how a user can direct the LLM to specify intent or implementation, respectively. For the second, we investigate the impact of results from symbolic analyses on the specifications generated by the LLM. The LLM prompts are augmented with outputs from two formal methods tools in the Frama-C ecosystem, PathCrawler and EVA. We demonstrate how the addition of symbolic analysis to the workflow impacts the quality of annotations.
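To fix the vocabulary, here is a minimal, hypothetical sketch (ours, not taken from the paper) of the intent-vs-implementation distinction in ACSL: a buggy C function whose intent-level contract cannot be proved against its body, while an implementation-level contract can. In the paper's workflow, the prompt given to the LLM would additionally carry outputs from PathCrawler (generated test cases) and EVA (inferred value ranges); the function and contracts below are illustrative only.

```c
/* Hypothetical example: the INTENT is "return the maximum of a and b",
   but the branches are swapped, so the body returns the minimum. */

/* Intent-level ACSL contract. A deductive verifier such as Frama-C/WP
   would fail to prove it against the buggy body below; that failure is
   the signal that implementation and intent disagree. An
   implementation-level contract would instead state
       ensures \result <= a && \result <= b;
   and would be provable against the same body. */
/*@ ensures \result >= a && \result >= b;
    ensures \result == a || \result == b;
    assigns \nothing;
*/
int max2(int a, int b) {
    return (a > b) ? b : a;   /* bug: returns the minimum */
}
```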
Related papers
- Aligning Large Language Models to Follow Instructions and Hallucinate Less via Effective Data Filtering [66.5524727179286]
NOVA is a framework designed to identify high-quality instruction data that aligns well with the LLM's learned knowledge, in order to reduce hallucinations.
It includes Internal Consistency Probing (ICP) and Semantic Equivalence Identification (SEI) to measure how familiar the LLM is with instruction data.
To ensure the quality of selected samples, we introduce an expert-aligned reward model, considering characteristics beyond just familiarity.
arXiv Detail & Related papers (2025-02-11T08:05:56Z)
- SpecTool: A Benchmark for Characterizing Errors in Tool-Use LLMs [77.79172008184415]
SpecTool is a new benchmark to identify error patterns in LLM output on tool-use tasks.
We show that even the most prominent LLMs exhibit these error patterns in their outputs.
Researchers can use the analysis and insights from SpecTool to guide their error mitigation strategies.
arXiv Detail & Related papers (2024-11-20T18:56:22Z)
- Large Language Models are Interpretable Learners [53.56735770834617]
In this paper, we show a combination of Large Language Models (LLMs) and symbolic programs can bridge the gap between expressiveness and interpretability.
The pretrained LLM with natural language prompts provides a massive set of interpretable modules that can transform raw input into natural language concepts.
As the knowledge learned by an LSP (LLM-based Symbolic Program) is a combination of natural language descriptions and symbolic rules, it is easily transferable to humans (interpretable) and to other LLMs.
arXiv Detail & Related papers (2024-06-25T02:18:15Z)
- Specify What? Enhancing Neural Specification Synthesis by Symbolic Methods [1.2152813244704233]
We investigate how combinations of Large Language Models and symbolic analyses can be used to synthesise specifications of C programs.
The method infers the program's intent rather than its behaviour, by generating specifications for buggy programs and observing the robustness of the resulting specifications against the bugs.
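A hypothetical illustration of this robustness idea (ours, not the paper's): the intent-level contract for an absolute-value function should stay the same even if the body handed to the LLM contains a bug, such as a missing negation.

```c
#include <limits.h>

/* Intent: return |x|. The precondition excludes INT_MIN, whose
   negation overflows on two's-complement machines. If the body were
   buggy (say, returning x unchanged for negative inputs), a specifier
   tracking intent should still produce this same contract; that
   stability is the robustness being measured. */
/*@ requires x > INT_MIN;
    ensures \result >= 0;
    ensures \result == x || \result == -x;
    assigns \nothing;
*/
int abs_int(int x) {
    return (x < 0) ? -x : x;
}
```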
arXiv Detail & Related papers (2024-06-21T17:39:57Z)
- Guiding LLM Temporal Logic Generation with Explicit Separation of Data and Control [0.7580487359358722]
Temporal logics are powerful tools that are widely used for the synthesis and verification of reactive systems.
Recent progress on Large Language Models has the potential to make the process of writing such specifications more accessible.
arXiv Detail & Related papers (2024-06-11T16:07:24Z)
- CodecLM: Aligning Language Models with Tailored Synthetic Data [51.59223474427153]
We introduce CodecLM, a framework for adaptively generating high-quality synthetic data to improve instruction-following abilities.
We first encode seed instructions into metadata, which are concise keywords generated on-the-fly to capture the target instruction distribution.
We also introduce Self-Rubrics and Contrastive Filtering during decoding to tailor data-efficient samples.
arXiv Detail & Related papers (2024-04-08T21:15:36Z)
- Hint-enhanced In-Context Learning wakes Large Language Models up for knowledge-intensive tasks [54.153914606302486]
In-context learning (ICL) ability has emerged with the increasing scale of large language models (LLMs).
We propose a new paradigm called Hint-enhanced In-Context Learning (HICL) to explore the power of ICL in open-domain question answering.
arXiv Detail & Related papers (2023-11-03T14:39:20Z)
- ReEval: Automatic Hallucination Evaluation for Retrieval-Augmented Large Language Models via Transferable Adversarial Attacks [91.55895047448249]
This paper presents ReEval, an LLM-based framework using prompt chaining to perturb the original evidence for generating new test cases.
We implement ReEval using ChatGPT and evaluate the resulting variants of two popular open-domain QA datasets.
Our generated data is human-readable and useful for triggering hallucinations in large language models.
arXiv Detail & Related papers (2023-10-19T06:37:32Z)
- Harnessing the Zero-Shot Power of Instruction-Tuned Large Language Model in End-to-End Speech Recognition [23.172469312225694]
We propose to utilize an instruction-tuned large language model (LLM) for guiding the text generation process in automatic speech recognition (ASR). The proposed model is built on the joint CTC and attention architecture, with the LLM serving as a front-end feature extractor for the decoder. Experimental results show that the proposed LLM-guided model achieves a relative gain of approximately 13% in word error rates across major benchmarks.
arXiv Detail & Related papers (2023-09-19T11:10:50Z)
- Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning [77.7070536959126]
In-context learning (ICL) emerges as a promising capability of large language models (LLMs).
In this paper, we investigate the working mechanism of ICL through an information flow lens.
We introduce an anchor re-weighting method to improve ICL performance, a demonstration compression technique to expedite inference, and an analysis framework for diagnosing ICL errors in GPT2-XL.
arXiv Detail & Related papers (2023-05-23T15:26:20Z)
- LMs: Understanding Code Syntax and Semantics for Code Analysis [25.508254718438636]
We evaluate the capabilities of large language models (LLMs) and their limitations for code analysis in software engineering.
We employ four state-of-the-art foundational models: GPT-4, GPT-3.5, StarCoder, and CodeLlama-13b-instruct.
arXiv Detail & Related papers (2023-05-20T08:43:49Z)