Leveraging LLMs for Formal Software Requirements -- Challenges and Prospects
- URL: http://arxiv.org/abs/2507.14330v1
- Date: Fri, 18 Jul 2025 19:15:50 GMT
- Title: Leveraging LLMs for Formal Software Requirements -- Challenges and Prospects
- Authors: Arshad Beg, Diarmuid O'Donoghue, Rosemary Monahan,
- Abstract summary: VERIFAI1 aims to investigate automated and semi-automated approaches to bridge this gap.<n>This position paper presents a preliminary synthesis of relevant literature to identify recurring challenges and prospective research directions.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Software correctness is ensured mathematically through formal verification, which involves the resources of generating formal requirement specifications and having an implementation that must be verified. Tools such as model-checkers and theorem provers ensure software correctness by verifying the implementation against the specification. Formal methods deployment is regularly enforced in the development of safety-critical systems e.g. aerospace, medical devices and autonomous systems. Generating these specifications from informal and ambiguous natural language requirements remains the key challenge. Our project, VERIFAI^{1}, aims to investigate automated and semi-automated approaches to bridge this gap, using techniques from Natural Language Processing (NLP), ontology-based domain modelling, artefact reuse, and large language models (LLMs). This position paper presents a preliminary synthesis of relevant literature to identify recurring challenges and prospective research directions in the generation of verifiable specifications from informal requirements.
Related papers
- Re:Form -- Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny [68.00108157244952]
Large Language Models (LLMs) trained with Reinforcement Learning (RL) face a significant challenge: their verification processes are neither reliable nor scalable.<n>A promising yet largely uncharted alternative is formal language-based reasoning.<n>Grounding LLMs in rigorous formal systems where generative models operate in formal language spaces (e.g., Dafny) enables the automatic and mathematically provable verification of their reasoning processes and outcomes.
arXiv Detail & Related papers (2025-07-22T08:13:01Z) - Specification-Guided Repair of Arithmetic Errors in Dafny Programs using LLMs [84.30534714651093]
We present an innovative APR tool for Dafny, a verification-aware programming language.<n>We localize faults through a series of steps, which include using Hoare Logic to determine the state of each statement within the program.<n>We evaluate our approach using DafnyBench, a benchmark of real-world Dafny programs.
arXiv Detail & Related papers (2025-07-04T15:36:12Z) - Formalising Software Requirements using Large Language Models [0.0]
The project addresses the challenges in the traceability and verification of formal specifications.<n>It provides support for the automatic generation of the formal specifications and the traceability of the requirements from the initial software design stage through the systems implementation and verification.
arXiv Detail & Related papers (2025-06-12T13:55:01Z) - Self-Steering Language Models [113.96916935955842]
DisCIPL is a method for "self-steering" language models.<n>DisCIPL uses a Planner model to generate a task-specific inference program.<n>Our work opens up a design space of highly-parallelized Monte Carlo inference strategies.
arXiv Detail & Related papers (2025-04-09T17:54:22Z) - Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering [51.31836988300326]
Verifier engineering is a novel post-training paradigm specifically designed for the era of foundation models.
We systematically categorize the verifier engineering process into three essential stages: search, verify, and feedback.
arXiv Detail & Related papers (2024-11-18T12:04:52Z) - Benchmarking Uncertainty Quantification Methods for Large Language Models with LM-Polygraph [83.90988015005934]
Uncertainty quantification is a key element of machine learning applications.<n>We introduce a novel benchmark that implements a collection of state-of-the-art UQ baselines.<n>We conduct a large-scale empirical investigation of UQ and normalization techniques across eleven tasks, identifying the most effective approaches.
arXiv Detail & Related papers (2024-06-21T20:06:31Z) - Enchanting Program Specification Synthesis by Large Language Models using Static Analysis and Program Verification [15.686651364655958]
AutoSpec is an automated approach to synthesize specifications for automated program verification.
It overcomes the shortcomings of existing work in specification versatility, synthesizing satisfiable and adequate specifications for full proof.
It can be successfully applied to verify the programs in a real-world X509-parser project.
arXiv Detail & Related papers (2024-03-31T18:15:49Z) - Validation of Rigorous Requirements Specifications and Document
Automation with the ITLingo RSL Language [0.0]
ITLingo initiative has introduced a requirements specification language named RSL to enhance the rigor and consistency of technical documentation.
This paper reviews existing research and tools in the fields of requirements validation and document automation.
We propose to extend RSL with validation of specifications based on customized checks, and on linguistic rules dynamically defined in the RSL itself.
arXiv Detail & Related papers (2023-12-17T21:39:26Z) - Natural Language Processing for Requirements Formalization: How to
Derive New Approaches? [0.32885740436059047]
We present and discuss principal ideas and state-of-the-art methodologies from the field of NLP.
We discuss two different approaches in detail and highlight the iterative development of rule sets.
The presented methods are demonstrated on two industrial use cases from the automotive and railway domains.
arXiv Detail & Related papers (2023-09-23T05:45:19Z) - A General Framework for Verification and Control of Dynamical Models via Certificate Synthesis [54.959571890098786]
We provide a framework to encode system specifications and define corresponding certificates.
We present an automated approach to formally synthesise controllers and certificates.
Our approach contributes to the broad field of safe learning for control, exploiting the flexibility of neural networks.
arXiv Detail & Related papers (2023-09-12T09:37:26Z) - Technical Report on Neural Language Models and Few-Shot Learning for
Systematic Requirements Processing in MDSE [1.6286277560322266]
This paper is based on the analysis of an open-source set of automotive requirements.
We derive domain-specific language constructs helping us to avoid ambiguities in requirements and increase the level of formality.
arXiv Detail & Related papers (2022-11-16T18:06:25Z) - Certified Reinforcement Learning with Logic Guidance [78.2286146954051]
We propose a model-free RL algorithm that enables the use of Linear Temporal Logic (LTL) to formulate a goal for unknown continuous-state/action Markov Decision Processes (MDPs)
The algorithm is guaranteed to synthesise a control policy whose traces satisfy the specification with maximal probability.
arXiv Detail & Related papers (2019-02-02T20:09:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.