Seeking Specifications: The Case for Neuro-Symbolic Specification Synthesis
- URL: http://arxiv.org/abs/2504.21061v1
- Date: Tue, 29 Apr 2025 10:02:29 GMT
- Title: Seeking Specifications: The Case for Neuro-Symbolic Specification Synthesis
- Authors: George Granberry, Wolfgang Ahrendt, Moa Johansson
- Abstract summary: This work is concerned with the generation of formal specifications from code, using Large Language Models (LLMs) in combination with symbolic methods. It addresses two research directions: the specification of intent vs. implementation, and the combination of symbolic analyses with LLMs. For the first, we investigate how the absence or presence of bugs in the code impacts the generated specifications. For the second, we investigate the impact of results from symbolic analyses on the specifications generated by the LLM.
- Score: 1.2152813244704233
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work is concerned with the generation of formal specifications from code, using Large Language Models (LLMs) in combination with symbolic methods. Concretely, in our study, the programming language is C, the specification language is ACSL, and the LLM is Deepseek-R1. In this context, we address two research directions, namely the specification of intent vs. implementation on the one hand, and the combination of symbolic analyses with LLMs on the other hand. For the first, we investigate how the absence or presence of bugs in the code impacts the generated specifications, as well as whether and how a user can direct the LLM to specify intent or implementation, respectively. For the second, we investigate the impact of results from symbolic analyses on the specifications generated by the LLM. The LLM prompts are augmented with outputs from two formal methods tools in the Frama-C ecosystem, PathCrawler and EVA. We demonstrate how the addition of symbolic analysis to the workflow impacts the quality of annotations.
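To fix the vocabulary, here is a minimal, hypothetical sketch (ours, not taken from the paper) of the intent-vs-implementation distinction in ACSL: a buggy C function whose intent-level contract cannot be proved against its body, while an implementation-level contract can. In the paper's workflow, the prompt given to the LLM would additionally carry outputs from PathCrawler (generated test cases) and EVA (inferred value ranges); the function and contracts below are illustrative only.

```c
/* Hypothetical example: the INTENT is "return the maximum of a and b",
   but the branches are swapped, so the body returns the minimum. */

/* Intent-level ACSL contract. A deductive verifier such as Frama-C/WP
   would fail to prove it against the buggy body below; that failure is
   the signal that implementation and intent disagree. An
   implementation-level contract would instead state
       ensures \result <= a && \result <= b;
   and would be provable against the same body. */
/*@ ensures \result >= a && \result >= b;
    ensures \result == a || \result == b;
    assigns \nothing;
*/
int max2(int a, int b) {
    return (a > b) ? b : a;   /* bug: returns the minimum */
}
```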
Related papers
- Aligning Large Language Models to Follow Instructions and Hallucinate Less via Effective Data Filtering [66.5524727179286]
NOVA is a framework designed to identify high-quality instruction data that aligns well with the LLM's learned knowledge, in order to reduce hallucinations.
It includes Internal Consistency Probing (ICP) and Semantic Equivalence Identification (SEI) to measure how familiar the LLM is with instruction data.
To ensure the quality of selected samples, we introduce an expert-aligned reward model, considering characteristics beyond just familiarity.
arXiv Detail & Related papers (2025-02-11T08:05:56Z)
- SpecTool: A Benchmark for Characterizing Errors in Tool-Use LLMs [77.79172008184415]
SpecTool is a new benchmark to identify error patterns in LLM output on tool-use tasks.
We show that even the most prominent LLMs exhibit these error patterns in their outputs.
Researchers can use the analysis and insights from SpecTool to guide their error mitigation strategies.
arXiv Detail & Related papers (2024-11-20T18:56:22Z)
- Large Language Models are Interpretable Learners [53.56735770834617]
In this paper, we show a combination of Large Language Models (LLMs) and symbolic programs can bridge the gap between expressiveness and interpretability.
The pretrained LLM with natural language prompts provides a massive set of interpretable modules that can transform raw input into natural language concepts.
As the knowledge learned by an LSP (LLM-based Symbolic Program) is a combination of natural language descriptions and symbolic rules, it is easily transferable to humans (interpretable) and to other LLMs.
arXiv Detail & Related papers (2024-06-25T02:18:15Z)
- Specify What? Enhancing Neural Specification Synthesis by Symbolic Methods [1.2152813244704233]
We investigate how combinations of Large Language Models and symbolic analyses can be used to synthesise specifications of C programs.
The method infers the program's intent rather than its behaviour, by generating specifications for buggy programs and observing the robustness of the resulting specifications against the bugs.
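A hypothetical illustration of this robustness idea (ours, not the paper's): the intent-level contract for an absolute-value function should stay the same even if the body handed to the LLM contains a bug, such as a missing negation.

```c
#include <limits.h>

/* Intent: return |x|. The precondition excludes INT_MIN, whose
   negation overflows on two's-complement machines. If the body were
   buggy (say, returning x unchanged for negative inputs), a specifier
   tracking intent should still produce this same contract; that
   stability is the robustness being measured. */
/*@ requires x > INT_MIN;
    ensures \result >= 0;
    ensures \result == x || \result == -x;
    assigns \nothing;
*/
int abs_int(int x) {
    return (x < 0) ? -x : x;
}
```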
arXiv Detail & Related papers (2024-06-21T17:39:57Z)
- Guiding LLM Temporal Logic Generation with Explicit Separation of Data and Control [0.7580487359358722]
Temporal logics are powerful tools that are widely used for the synthesis and verification of reactive systems.
Recent progress on Large Language Models has the potential to make the process of writing such specifications more accessible.
arXiv Detail & Related papers (2024-06-11T16:07:24Z)
- CodecLM: Aligning Language Models with Tailored Synthetic Data [51.59223474427153]
We introduce CodecLM, a framework for adaptively generating high-quality synthetic data to improve instruction-following abilities.
We first encode seed instructions into metadata, which are concise keywords generated on-the-fly to capture the target instruction distribution.
We also introduce Self-Rubrics and Contrastive Filtering during decoding to tailor data-efficient samples.
arXiv Detail & Related papers (2024-04-08T21:15:36Z)
- Hint-enhanced In-Context Learning wakes Large Language Models up for knowledge-intensive tasks [54.153914606302486]
In-context learning (ICL) ability has emerged with the increasing scale of large language models (LLMs).
We propose a new paradigm called Hint-enhanced In-Context Learning (HICL) to explore the power of ICL in open-domain question answering.
arXiv Detail & Related papers (2023-11-03T14:39:20Z)
- ReEval: Automatic Hallucination Evaluation for Retrieval-Augmented Large Language Models via Transferable Adversarial Attacks [91.55895047448249]
This paper presents ReEval, an LLM-based framework using prompt chaining to perturb the original evidence for generating new test cases.
We implement ReEval using ChatGPT and evaluate the resulting variants of two popular open-domain QA datasets.
Our generated data is human-readable and useful for triggering hallucinations in large language models.
arXiv Detail & Related papers (2023-10-19T06:37:32Z)
- Harnessing the Zero-Shot Power of Instruction-Tuned Large Language Model in End-to-End Speech Recognition [23.172469312225694]
We propose to utilize an instruction-tuned large language model (LLM) for guiding the text generation process in automatic speech recognition (ASR). The proposed model is built on the joint CTC and attention architecture, with the LLM serving as a front-end feature extractor for the decoder. Experimental results show that the proposed LLM-guided model achieves a relative gain of approximately 13% in word error rates across major benchmarks.
arXiv Detail & Related papers (2023-09-19T11:10:50Z)
- Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning [77.7070536959126]
In-context learning (ICL) emerges as a promising capability of large language models (LLMs).
In this paper, we investigate the working mechanism of ICL through an information flow lens.
We introduce an anchor re-weighting method to improve ICL performance, a demonstration compression technique to expedite inference, and an analysis framework for diagnosing ICL errors in GPT2-XL.
arXiv Detail & Related papers (2023-05-23T15:26:20Z)
- LMs: Understanding Code Syntax and Semantics for Code Analysis [25.508254718438636]
We evaluate the capabilities of large language models (LLMs) and their limitations for code analysis in software engineering.
We employ four state-of-the-art foundational models: GPT-4, GPT-3.5, StarCoder, and CodeLlama-13b-instruct.
arXiv Detail & Related papers (2023-05-20T08:43:49Z)