Specify What? Enhancing Neural Specification Synthesis by Symbolic Methods
- URL: http://arxiv.org/abs/2406.15540v2
- Date: Wed, 18 Sep 2024 08:21:29 GMT
- Title: Specify What? Enhancing Neural Specification Synthesis by Symbolic Methods
- Authors: George Granberry, Wolfgang Ahrendt, Moa Johansson
- Abstract summary: We investigate how combinations of Large Language Models and symbolic analyses can be used to synthesise specifications of C programs.
The method infers the program's intent rather than its behaviour, by generating specifications for buggy programs and observing the robustness of the results against bugs.
- Score: 1.2152813244704233
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We investigate how combinations of Large Language Models (LLMs) and symbolic analyses can be used to synthesise specifications of C programs. The LLM prompts are augmented with outputs from two formal methods tools in the Frama-C ecosystem, Pathcrawler and EVA, to produce C program annotations in the specification language ACSL. We demonstrate how the addition of symbolic analysis to the workflow impacts the quality of annotations: information about input/output examples from Pathcrawler produces more context-aware annotations, while the inclusion of EVA reports yields annotations more attuned to runtime errors. In addition, we show that the method infers the program's intent rather than its behaviour, by generating specifications for buggy programs and observing the robustness of the result against bugs.
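As a hedged illustration (not an example taken from the paper), the sketch below shows the kind of ACSL-annotated C function such a pipeline aims to produce; the function, its contract, and the clause structure are invented for this listing.

```c
/* Hypothetical example of an ACSL-annotated C function, illustrating the
   kind of specification an LLM + Pathcrawler/EVA workflow targets.
   Both the function and the contract are invented for illustration. */

/*@ requires n > 0;
  @ requires \valid_read(a + (0 .. n - 1));
  @ assigns \nothing;
  @ ensures \forall integer i; 0 <= i < n ==> \result >= a[i];
  @ ensures \exists integer i; 0 <= i < n && \result == a[i];
  @*/
int max_array(const int *a, int n) {
    int m = a[0];
    for (int i = 1; i < n; i++)
        if (a[i] > m)
            m = a[i];
    return m;
}
```

Loosely in the abstract's terms, EVA-style runtime-error reports would inform defensive requires clauses such as the bounds precondition, while PathCrawler-style input/output examples would inform behavioural ensures clauses.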
Related papers
- Seeking Specifications: The Case for Neuro-Symbolic Specification Synthesis [1.2152813244704233]
This work is concerned with the generation of formal specifications from code, using Large Language Models (LLMs) in combination with symbolic methods.
First, we investigate how the absence or presence of bugs in the code impacts the generated specifications.
Second, we investigate the impact of results from symbolic analyses on the specifications generated by the LLM.
arXiv Detail & Related papers (2025-04-29T10:02:29Z)
- AssertionForge: Enhancing Formal Verification Assertion Generation with Structured Representation of Specifications and RTL [6.062811197376495]
We propose a novel approach that constructs a Knowledge Graph (KG) from both specifications and RTL.
We create an initial KG from the specification and then systematically fuse it with information extracted from the RTL code, resulting in a unified, comprehensive KG.
Experiments on four designs demonstrate that our method significantly enhances the quality of generated SystemVerilog Assertions (SVAs) over prior methods.
arXiv Detail & Related papers (2025-03-24T21:53:37Z)
- Enhancing Input-Label Mapping in In-Context Learning with Contrastive Decoding [71.01099784480597]
Large language models (LLMs) excel at a range of tasks through in-context learning (ICL).
We introduce In-Context Contrastive Decoding (ICCD), a novel method that emphasizes input-label mapping by contrasting the output distributions between positive and negative in-context examples.
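A minimal sketch of the contrastive step summarised above, assuming a standard contrastive-decoding combination of next-token logits; the weight alpha, the linear formula, and the toy vocabulary are assumptions for illustration, not ICCD's published formulation.

```c
/* Hedged sketch: combine next-token logits obtained under positive vs.
   negative (wrong-label) in-context examples, amplifying what the positive
   context predicts and penalising what the negative context predicts.
   alpha and the linear combination are assumptions, not ICCD's exact rule. */
#include <stdio.h>
#define VOCAB 5

int main(void) {
    /* toy next-token logits under positive vs. negative demonstrations */
    double pos[VOCAB] = {2.0, 0.5, 1.0, -1.0, 0.0};
    double neg[VOCAB] = {1.8, 1.5, -0.5, -1.0, 0.2};
    double alpha = 0.5;
    double score[VOCAB];
    int best = 0;
    for (int t = 0; t < VOCAB; t++) {
        score[t] = (1.0 + alpha) * pos[t] - alpha * neg[t];
        if (score[t] > score[best]) best = t;
    }
    for (int t = 0; t < VOCAB; t++)
        printf("token %d: contrastive score %.2f\n", t, score[t]);
    printf("selected token: %d\n", best);
    return 0;
}
```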
arXiv Detail & Related papers (2025-02-19T14:04:46Z)
- When LLMs Struggle: Reference-less Translation Evaluation for Low-resource Languages [9.138590152838754]
Segment-level quality estimation (QE) is a challenging cross-lingual language understanding task.
We comprehensively evaluate large language models (LLMs) in zero/few-shot scenarios.
Our results indicate that prompt-based approaches are outperformed by the encoder-based fine-tuned QE models.
arXiv Detail & Related papers (2025-01-08T12:54:05Z)
- Decompose, Enrich, and Extract! Schema-aware Event Extraction using LLMs [45.83950260830323]
This work focuses on harnessing Large Language Models for automated Event Extraction.
It introduces a new method to address hallucination by decomposing the task into Event Detection and Event Argument Extraction.
arXiv Detail & Related papers (2024-06-03T06:55:10Z)
- C-ICL: Contrastive In-context Learning for Information Extraction [54.39470114243744]
c-ICL is a novel few-shot technique that leverages both correct and incorrect sample constructions to create in-context learning demonstrations.
Our experiments on various datasets indicate that c-ICL outperforms previous few-shot in-context learning methods.
arXiv Detail & Related papers (2024-02-17T11:28:08Z)
- Improving Input-label Mapping with Demonstration Replay for In-context Learning [67.57288926736923]
In-context learning (ICL) is an emerging capability of large autoregressive language models.
We propose a novel ICL method called Repeated Demonstration with Sliding Causal Attention (RdSca).
We show that our method significantly improves the input-label mapping in ICL demonstrations.
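For intuition, a minimal sketch of a generic sliding-window causal attention mask is given below; the window size and matrix layout are assumptions, and the paper's RdSca additionally relies on demonstration replay, which this sketch does not model.

```c
/* Hedged sketch: a generic sliding-window causal attention mask.
   Position i may attend to position j only if j is not in the future
   (causal) and lies within a fixed window. Sizes are illustrative. */
#include <stdio.h>
#define SEQ 8
#define WINDOW 4

int main(void) {
    int mask[SEQ][SEQ];
    for (int i = 0; i < SEQ; i++)
        for (int j = 0; j < SEQ; j++)
            mask[i][j] = (j <= i) && (i - j < WINDOW);
    /* print the mask: 1 = attention allowed, 0 = masked out */
    for (int i = 0; i < SEQ; i++) {
        for (int j = 0; j < SEQ; j++)
            printf("%d ", mask[i][j]);
        printf("\n");
    }
    return 0;
}
```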
arXiv Detail & Related papers (2023-10-30T14:29:41Z)
- Exploring Large Language Models for Code Explanation [3.2570216147409514]
Large Language Models (LLMs) have made remarkable strides in Natural Language Processing.
This study specifically delves into the task of generating natural-language summaries for code snippets, using various LLMs.
arXiv Detail & Related papers (2023-10-25T14:38:40Z)
- Instruction Position Matters in Sequence Generation with Large Language Models [67.87516654892343]
Large language models (LLMs) are capable of performing conditional sequence generation tasks, such as translation or summarization.
We propose enhancing the instruction-following capability of LLMs by shifting the position of task instructions after the input sentences.
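A minimal sketch of the two prompt layouts being compared, with hypothetical wording; the paper's finding is that the instruction-after-input layout improves instruction following.

```c
/* Hedged sketch: the same request in two layouts. The concrete wording is
   hypothetical, not the paper's exact templates. */
#include <stdio.h>

int main(void) {
    const char *instruction_first =
        "Translate the following sentence into German.\n"
        "Input: The weather is nice today.";
    const char *instruction_last =
        "Input: The weather is nice today.\n"
        "Translate the preceding sentence into German.";
    printf("--- instruction before input ---\n%s\n\n", instruction_first);
    printf("--- instruction after input ---\n%s\n", instruction_last);
    return 0;
}
```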
arXiv Detail & Related papers (2023-08-23T12:36:57Z)
- IERL: Interpretable Ensemble Representation Learning -- Combining CrowdSourced Knowledge and Distributed Semantic Representations [11.008412414253662]
Large Language Models (LLMs) encode meanings of words in the form of distributed semantics.
Recent studies have shown that LLMs tend to generate unintended, inconsistent, or wrong texts as outputs.
We propose a novel ensemble learning method, Interpretable Ensemble Representation Learning (IERL), that systematically combines LLM and crowdsourced knowledge representations.
arXiv Detail & Related papers (2023-06-24T05:02:34Z)
- Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning [77.7070536959126]
In-context learning (ICL) emerges as a promising capability of large language models (LLMs).
In this paper, we investigate the working mechanism of ICL through an information flow lens.
We introduce an anchor re-weighting method to improve ICL performance, a demonstration compression technique to expedite inference, and an analysis framework for diagnosing ICL errors in GPT2-XL.
arXiv Detail & Related papers (2023-05-23T15:26:20Z)
- Evaluating Step-by-Step Reasoning through Symbolic Verification [20.156768135017007]
Pre-trained language models (LMs) have shown remarkable reasoning performance for in-context learning.
LMLP enjoys more than 25% higher accuracy than chain-of-thought (CoT) on length generalization benchmarks, even with smaller model sizes.
arXiv Detail & Related papers (2022-12-16T19:30:01Z)
- Examining Scaling and Transfer of Language Model Architectures for Machine Translation [51.69212730675345]
Language models (LMs) process sequences in a single stack of layers, and encoder-decoder models (EncDec) utilize separate layer stacks for input and output processing.
In machine translation, EncDec has long been the favoured approach, but few studies have investigated the performance of LMs.
arXiv Detail & Related papers (2022-02-01T16:20:15Z)
- Reducing Confusion in Active Learning for Part-Of-Speech Tagging [100.08742107682264]
Active learning (AL) uses a data selection algorithm to select useful training samples to minimize annotation cost.
We study the problem of selecting instances which maximally reduce the confusion between particular pairs of output tags.
Our proposed AL strategy outperforms other AL strategies by a significant margin.
arXiv Detail & Related papers (2020-11-02T06:24:58Z)
- ENIGMA Anonymous: Symbol-Independent Inference Guiding Machine (system description) [0.4893345190925177]
We describe an implementation of gradient boosting and neural guidance of saturation-style automated theorem provers.
For the gradient-boosting guidance, we manually create abstracted features by considering arity-based encodings of formulas.
For the neural guidance, we use symbol-independent graph neural networks (GNNs) and their embedding of the terms and clauses.
arXiv Detail & Related papers (2020-02-13T09:44:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.