Related papers: Enhancing Formal Software Specification with Artificial Intelligence

Enhancing Formal Software Specification with Artificial Intelligence

URL: http://arxiv.org/abs/2601.09745v1
Date: Sun, 11 Jan 2026 13:53:22 GMT
Title: Enhancing Formal Software Specification with Artificial Intelligence
Authors: Antonio Abu Nassar, Eitan Farchi,
Abstract summary: Formal software specification is known to enable early error detection and explicit invariants.<n>It has seen limited industrial adoption due to its high notation overhead and the expertise required to use traditional formal languages.<n>Recent advances in artificial intelligence make it possible to retain many of the benefits of formal specification while substantially reducing these costs.
Score: 1.4217721366366873
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Formal software specification is known to enable early error detection and explicit invariants, yet it has seen limited industrial adoption due to its high notation overhead and the expertise required to use traditional formal languages. This paper presents a case study showing that recent advances in artificial intelligence make it possible to retain many of the benefits of formal specification while substantially reducing these costs. The necessity of a clear distinction between what is controlled by the system analyst and can highly benefits from the rigor of formal specification and what need not be controlled is demonstrated. We use natural language augmented with lightweight mathematical notation and written in \LaTeX\ as an intermediate specification language, which is reviewed and refined by AI prior to code generation. Applied to a nontrivial simulation of organizational knowledge growth, this approach enables early validation, explicit invariants, and correctness by design, while significantly reducing development effort and producing a correct implementation on the first attempt.

Related papers

Do It for HER: First-Order Temporal Logic Reward Specification in Reinforcement Learning (Extended Version) [49.462399222747024]
We propose a novel framework for the logical specification of non-Markovian rewards in Decision Processes (MDPs) with large state spaces.<n>Our approach leverages Linear Temporal Logic Modulo Theories over finite traces (LTLfMT)<n>We introduce a method based on reward machines and Hindsight Experience Replay (HER) to translate first-order logic specifications and address reward sparsity.
arXiv Detail & Related papers (2026-02-05T22:11:28Z)
Guidelines to Prompt Large Language Models for Code Generation: An Empirical Characterization [82.29178197694819]
We derive and evaluate development-specific prompt optimization guidelines.<n>We use an iterative, test-driven approach to automatically refine code generation prompts.<n>We conduct an assessment with 50 practitioners, who report their usage of the elicited prompt improvement patterns.
arXiv Detail & Related papers (2026-01-19T15:01:42Z)
Fast Thinking for Large Language Models [67.7238685892317]
We introduce Latent Codebooks for Fast Thinking, a framework that uses concise CoT sketches only during training to learn a codebook of discrete strategy priors.<n>At inference, the model conditions on a handful of continuous thinking switches distilled from the codebook in a single pass, enabling strategy-level guidance without producing explicit reasoning tokens.
arXiv Detail & Related papers (2025-09-28T04:19:48Z)
Re:Form -- Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny [78.1575956773948]
Large Language Models (LLMs) trained with Reinforcement Learning (RL) face a significant challenge: their verification processes are neither reliable nor scalable.<n>A promising yet largely uncharted alternative is formal language-based reasoning.<n>Grounding LLMs in rigorous formal systems where generative models operate in formal language spaces (e.g., Dafny) enables the automatic and mathematically provable verification of their reasoning processes and outcomes.
arXiv Detail & Related papers (2025-07-22T08:13:01Z)
Leveraging LLMs for Formal Software Requirements -- Challenges and Prospects [0.0]
VERIFAI1 aims to investigate automated and semi-automated approaches to bridge this gap.<n>This position paper presents a preliminary synthesis of relevant literature to identify recurring challenges and prospective research directions.
arXiv Detail & Related papers (2025-07-18T19:15:50Z)
Prompting Encoder Models for Zero-Shot Classification: A Cross-Domain Study in Italian [75.94354349994576]
This paper explores the feasibility of employing smaller, domain-specific encoder LMs alongside prompting techniques to enhance performance in specialized contexts. Our study concentrates on the Italian bureaucratic and legal language, experimenting with both general-purpose and further pre-trained encoder-only models. The results indicate that while further pre-trained models may show diminished robustness in general knowledge, they exhibit superior adaptability for domain-specific tasks, even in a zero-shot setting.
arXiv Detail & Related papers (2024-07-30T08:50:16Z)
Evaluating LLM-driven User-Intent Formalization for Verification-Aware Languages [6.0608817611709735]
We propose a metric for evaluating the quality of specifications for verification-aware languages. We show that our metric agrees closely on a human-labeled dataset of Dafny specifications for the popular MBPP code-generation benchmark. We also outline formal verification challenges that need to be addressed to apply the technique more widely.
arXiv Detail & Related papers (2024-06-14T06:52:08Z)
On-the-Fly Syntax Highlighting: Generalisation and Speed-ups [2.208443815105053]
On-the-fly syntax highlighting is the task of rapidly associating visual secondary notation values with each character of a language derivation. Speed constraints are essential to ensure tool usability, manifesting as responsiveness for end users accessing online source code. achieving precise highlighting is critical for enhancing code comprehensibility. addressing the development costs of such resolvers is imperative, given the multitude of programming language versions.
arXiv Detail & Related papers (2024-02-13T19:43:22Z)
nl2spec: Interactively Translating Unstructured Natural Language to Temporal Logics with Large Language Models [3.1143846686797314]
We present nl2spec, a framework for applying Large Language Models (LLMs) derive formal specifications from unstructured natural language. We introduce a new methodology to detect and resolve the inherent ambiguity of system requirements in natural language. Users iteratively add, delete, and edit these sub-translations to amend erroneous formalizations, which is easier than manually redrafting the entire formalization.
arXiv Detail & Related papers (2023-03-08T20:08:53Z)
Lexically-constrained Text Generation through Commonsense Knowledge Extraction and Injection [62.071938098215085]
We focus on the Commongen benchmark, wherein the aim is to generate a plausible sentence for a given set of input concepts. We propose strategies for enhancing the semantic correctness of the generated text.
arXiv Detail & Related papers (2020-12-19T23:23:40Z)

This list is automatically generated from the titles and abstracts of the papers in this site.